I can't store embeddings in a PostgreSQL database.

I am computing embeddings in Java using the OpenAI API.

EmbeddingResult embeddingResult = openAIService.createEmbedding(embeddingInput);
List<Double> embeddingsList = embeddingResult.getData().get(0).getEmbedding();

This returns a List<Double>.

Now I need to store it in a Postgres database, in a single column of type vector(1536).

I tried writing my own AttributeConverter so that Hibernate can handle List<Double> as a PostgreSQL array/vector.

@Converter(autoApply = true)
public class ListToDoubleArrayConverter implements AttributeConverter<List<Double>, Double[]> {

    @Override
    public Double[] convertToDatabaseColumn(List<Double> attribute) {
        if (attribute == null || attribute.isEmpty()) {
            return new Double[0];
        }
        return attribute.toArray(new Double[attribute.size()]);
    }

    @Override
    public List<Double> convertToEntityAttribute(Double[] dbData) {
        if (dbData == null || dbData.length == 0) {
            return new ArrayList<>();
        }
        return Arrays.asList(dbData);
    }
}

And I use it in the entity like this:

@Entity
@Data
@Table(name = "explicit_questions")
@EntityListeners(AuditingEntityListener.class)
public class ExplicitQuestion implements IBaseModel {

    ...

    @Convert(converter = ListToDoubleArrayConverter.class)
    private List<Double> embedding;

This fails with the following error:

o.h.engine.jdbc.spi.SqlExceptionHelper : ERROR: column "embedding" is of type vector but expression is of type bytea
  Hint: You will need to rewrite or cast the expression.

Next I tried mapping the field as jsonb:

@Column(columnDefinition = "jsonb")
@Type(type = "jsonb")
private List<Double> embedding;

With this mapping I was able to store the embedding, but I wasn't able to fetch it back.

I also tried defining my own type:

@TypeDef(
    name = "double-array",
    typeClass = DoubleArrayType.class
)
public class ExplicitQuestion... {

    @Type(type = "double-array")
    @Column(
        name = "embeddings", 
        columnDefinition = "double precision[]"
    )
    private List<Double> embeddings;

}

None of these attempts worked.

I use Spring's PagingAndSortingRepository to generate the SQL automatically. I also tried writing my own query:

@Query(value = "UPDATE explicit_questions set embedding = :embedding where id = :id", nativeQuery = true)
void createEmbedding(@Param("id") Long id, @Param("embedding") Double[] embedding);
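
One possible variation on the native query above, sketched here under assumptions and not something I have verified against this schema: pgvector accepts a text form such as [0.1,0.2,0.3], so the embedding could be bound as a String and cast in SQL instead of being bound as a Double[]. The class name PgVectorLiteral and the method toLiteral are hypothetical helpers, not part of any library.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not from any library): renders an embedding in pgvector's text form. */
final class PgVectorLiteral {

    private PgVectorLiteral() {
    }

    /** Turns a list such as [0.1, 0.2] into the string "[0.1,0.2]". */
    static String toLiteral(List<Double> embedding) {
        if (embedding == null) {
            return null; // leave the column NULL when there is no embedding
        }
        return embedding.stream()
                .map(String::valueOf)                        // Double -> "0.1"
                .collect(Collectors.joining(",", "[", "]")); // wrap in brackets
    }
}
```

The query would then take a String parameter and cast it, for example UPDATE explicit_questions SET embedding = CAST(:embedding AS vector) WHERE id = :id. CAST is used instead of ::vector because the colons can clash with named parameters. Spring Data JPA also expects @Modifying on an update query.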

I know I could store the embeddings in a separate table, but that is too slow.

1 Answer

I found the solution in the question "What JPA + Hibernate data type should I use to support the vector extension in a PostgreSQL database?":

import com.fasterxml.jackson.annotation.JsonInclude;
import io.hypersistence.utils.hibernate.type.json.JsonType;
import org.hibernate.annotations.Type;
import org.hibernate.annotations.TypeDef;
import javax.persistence.*;
import java.util.List;

@Entity
@JsonInclude(JsonInclude.Include.NON_NULL)
@TypeDef(name = "json", typeClass = JsonType.class)
public class ExplicitQuestion implements IBaseModel {

    ...

    @Basic
    @Type(type = "json")
    @Column(name = "embedding", columnDefinition = "vector")
    private List<Double> embedding;
}
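
My guess at why this mapping works, which I have not confirmed in the library source: pgvector's text form, for example [0.1,0.2], is also a valid JSON array of numbers, so the JSON type can write and read it as is. If you ever read the column as plain text instead (for example with a native query that selects CAST(embedding AS text)), a small parser along these lines could turn it back into a List<Double>. PgVectorParser is a hypothetical name, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not from any library): parses pgvector's text form into a list. */
final class PgVectorParser {

    private PgVectorParser() {
    }

    /** Parses a string such as "[0.1,0.2]" into [0.1, 0.2]; null or "[]" gives an empty list. */
    static List<Double> parse(String literal) {
        List<Double> values = new ArrayList<>();
        if (literal == null) {
            return values;
        }
        String body = literal.trim();
        // Strip the surrounding brackets if they are present.
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        if (body.trim().isEmpty()) {
            return values;
        }
        for (String part : body.split(",")) {
            values.add(Double.parseDouble(part.trim()));
        }
        return values;
    }
}
```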