
Update: See the "Update" section below for the latest.

I have been working with Knex.js to build SQL queries in Node.js and have the code below. It works on a graph-like data model (nodes and links) in which a links table holds everything (links link to links). How can I turn this into a single query, instead of the current one query per attribute? The getTableName() function returns the string_links table for string values and <x>_links tables for the other data types, while the "basic" links table is just called links.

Essentially, it works like this. First, query the top level, where parent_id equals some "type" ID; if we are querying "user" objects, the type would be "user". So let instance = ... fetches the instance link for this user type. Then we go through each field of the query (for now, a query is just a boolean-valued map, like { email: true, name: true }). For each field, we run a query to find the nodes linked off the instance as so-called property links.

There are two types of properties, though I don't need to go into much detail on that: complex properties with audit trails and simple properties without audit trails. That is what the interactive branch in the logic distinguishes.

How can I make this into one SQL query? The SQL query it prints out for an example is like this:

select "id" from "links" where "parent_id" = '47c1956bz31330c' and "name" = 'link' limit 1
select "value" from "string_links" where "parent_id" = (select "value" from "links" where "parent_id" = '47c1956bz31330cv' and "name" = 'name' limit 1) and "name" = 'value' limit 1
select "value" from "text_links" where "parent_id" = (select "value" from "links" where "parent_id" = '47c1956bz31330cv' and "name" = 'website' limit 1) and "name" = 'value' limit 1
select "value" from "integer_links" where "parent_id" = (select "value" from "links" where "parent_id" = '47c1956bz31330cv' and "name" = 'revenue' limit 1) and "name" = 'value' limit 1
select "value" from "boolean_links" where "parent_id" = '47c1956bz31330' and "name" = 'verified' limit 1

The original Node.js for Knex.js is here, but really I'm just concerned with how to write this as one regular SQL query, and I can figure out how to make it in Knex.js from there:

async function selectInteractiveInstance(user, name, query) {
  const type = model.types[name]
  const typeId = await baseSchemaController.selectType(name)

  let instance = await knex.from(`links`)
    .select('id')
    .where('parent_id', typeId)
    .where('name', 'instance')
    .first()

  // { id: 123, props: { ... } }
  instance.props = {}

  for (let field in query) {
    let data = query[field]
    let attrSchema = type[field]
    const tableName = baseSchemaController.getTableName(attrSchema.type)

    if (attrSchema.interactive) {
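      // Subquery: the property link hanging off the instance.
      // Only `id` is selected above, so instance.id is used here.
      const query1 = knex
        .from(`links`)
        .select('value')
        .where('parent_id', instance.id)
        .where('name', field)
        .first()

      // The primitive value is looked up by parent_id, as in the printed SQL.
      const record = await knex
        .from(tableName)
        .select('value')
        .where('parent_id', query1)
        .where('name', 'value')
        .first()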

      if (record) {
        instance.props[field] = record.value
      }
    } else {
      const record = await knex
        .from(tableName)
        .select('value')
        .where('parent_id', instance.id)
        .where('name', field)
        .first()

      if (record) {
        instance.props[field] = record.value
      }
    }
  }

  return instance
}

I am asking because the number of queries this function runs equals the number of properties on the object. I would like to avoid that, but I am not that great at SQL yet. I don't see a clear path to making this one query, and I don't know whether it is even possible.

It is also an issue for another reason. If I want to grab 100 links and their "fields" (in the primitive link tables), keeping only those whose primitive values match a certain value, then all the field tables have to be queried together to see whether the query can be satisfied.

Update

I finally landed on a query that works in the optimistic case:

select 
  "x"."id" as "id", 
  "s1"."value" as "name", 
  "s2"."value" as "inc_id", 
  "s3"."value" as "website", 
  "s4"."value" as "revenue", 
  "s5"."value" as "verified" 
from "links" as "x" 
inner join "links" as "c1" on "c1"."parent_id" = "x"."id" 
inner join "string_links" as "s1" on "s1"."parent_id" = "c1"."value" 
inner join "links" as "c2" on "c2"."parent_id" = "x"."id" 
inner join "string_links" as "s2" on "s2"."parent_id" = "c2"."value" 
inner join "links" as "c3" on "c3"."parent_id" = "x"."id" 
inner join "text_links" as "s3" on "s3"."parent_id" = "c3"."value" 
inner join "links" as "c4" on "c4"."parent_id" = "x"."id" 
inner join "integer_links" as "s4" on "s4"."parent_id" = "c4"."value" 
inner join "boolean_links" as "s5" on "s5"."parent_id" = "x"."id" 
where "x"."parent_id" = '47c1956bz31330' 
and "x"."name" = 'link' 
and "c1"."name" = 'name' 
and "s1"."name" = 'value' 
and "c2"."name" = 'inc_id' 
and "s2"."name" = 'value' 
and "c3"."name" = 'website' 
and "s3"."name" = 'value' 
and "c4"."name" = 'revenue' 
and "s4"."name" = 'value' 
and "s5"."name" = 'verified'

This returns an object similar to what I am looking for, joining the same table several times, along with the primitive tables.
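The original function returned an object shaped like { id, props }, while the joined query returns one flat row per instance. As a minimal sketch of bridging the two, the helper below (rowToInstance is a hypothetical name, not part of Knex.js) moves every column other than id into props and skips columns that have no value, mirroring the if (record) checks in the loop above.

```javascript
// Hypothetical helper: reshape one flat result row into { id, props }.
// Assumes the row has an `id` column and one column per requested field.
function rowToInstance(row) {
  const { id, ...columns } = row
  const props = {}

  for (const [field, value] of Object.entries(columns)) {
    // Skip fields with no linked value (null or undefined),
    // but keep falsy values such as 0 and false.
    if (value !== null && value !== undefined) {
      props[field] = value
    }
  }

  return { id, props }
}

// Example usage with a made-up row:
const instance = rowToInstance({ id: 'a1', name: 'Acme', revenue: 0 })
console.log(instance)
```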

However, if any of the values are not linked (are so-called "null" in this context), the inner join drops the row and the query returns nothing. How can I still have it return a subset of the object properties, whatever it can find? Is there anything like an optional inner join?

  • Why do you want a single query to select from 4 different tables? You can construct a single query with UNION, but I am not sure what the benefits are. Commented Aug 18, 2021 at 15:39
  • @Serg I've updated with a working INNER JOIN query, but it fails when any attribute values are missing. How can I still have it return the object with properties as null if the attributes are missing? Commented Aug 18, 2021 at 19:07
  • You can use left join, see the answer. Commented Aug 18, 2021 at 20:01

1 Answer


Use LEFT JOIN and move the predicates that may not be satisfied into the ON clause. Something like this:

select 
  "x"."id" as "id", 
  "s1"."value" as "name", 
  "s2"."value" as "inc_id", 
  "s3"."value" as "website", 
  "s4"."value" as "revenue", 
  "s5"."value" as "verified" 
from "links" as "x" 
left join "links" as "c1" on "c1"."parent_id" = "x"."id" and "c1"."name" = 'name'
left join "string_links" as "s1" on "s1"."parent_id" = "c1"."value"  and "s1"."name" = 'value'
left join "links" as "c2" on "c2"."parent_id" = "x"."id" and "c2"."name" = 'inc_id'
left join "string_links" as "s2" on "s2"."parent_id" = "c2"."value" and "s2"."name" = 'value'
left join "links" as "c3" on "c3"."parent_id" = "x"."id" and "c3"."name" = 'website'
left join "text_links" as "s3" on "s3"."parent_id" = "c3"."value"  and "s3"."name" = 'value'
left join "links" as "c4" on "c4"."parent_id" = "x"."id" and "c4"."name" = 'revenue'
left join "integer_links" as "s4" on "s4"."parent_id" = "c4"."value"  and "s4"."name" = 'value'  
left join "boolean_links" as "s5" on "s5"."parent_id" = "x"."id"  and "s5"."name" = 'verified'
where "x"."parent_id" = '47c1956bz31330' 
   and "x"."name" = 'link' 
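
Since the set of fields varies per query, the LEFT JOIN statement above can be generated rather than written by hand. The following is only a sketch under assumptions: buildInstanceQuery and quoteIdent are hypothetical helpers, the fields argument (a map of field name to { table, interactive }) stands in for the question's getTableName() and attrSchema.interactive, and the top-level link name 'link' is taken from the query above. It produces a SQL string with ? placeholders plus a bindings array, which could then be handed to something like knex.raw(sql, bindings).

```javascript
// Hypothetical helper: quote an identifier, doubling embedded quotes.
function quoteIdent(name) {
  return '"' + String(name).replace(/"/g, '""') + '"'
}

// Hypothetical helper: build one LEFT JOIN query for all requested fields.
// `fields` maps a field name to { table, interactive } (an assumed shape).
function buildInstanceQuery(typeId, fields) {
  const select = ['"x"."id" as "id"']
  const joins = []
  const bindings = []

  Object.entries(fields).forEach(([field, schema], i) => {
    const n = i + 1
    const table = quoteIdent(schema.table)
    select.push(`"s${n}"."value" as ${quoteIdent(field)}`)

    if (schema.interactive) {
      // Two hops: instance -> property link -> primitive value.
      joins.push(`left join "links" as "c${n}" on "c${n}"."parent_id" = "x"."id" and "c${n}"."name" = ?`)
      joins.push(`left join ${table} as "s${n}" on "s${n}"."parent_id" = "c${n}"."value" and "s${n}"."name" = ?`)
      bindings.push(field, 'value')
    } else {
      // One hop: the primitive value hangs directly off the instance.
      joins.push(`left join ${table} as "s${n}" on "s${n}"."parent_id" = "x"."id" and "s${n}"."name" = ?`)
      bindings.push(field)
    }
  })

  const sql = [
    `select ${select.join(', ')}`,
    'from "links" as "x"',
    ...joins,
    'where "x"."parent_id" = ? and "x"."name" = ?',
  ].join('\n')

  // The WHERE placeholders come last in the SQL text, so bind them last.
  bindings.push(typeId, 'link')
  return { sql, bindings }
}

// Example usage with two of the fields from the query above:
const example = buildInstanceQuery('47c1956bz31330', {
  name: { table: 'string_links', interactive: true },
  verified: { table: 'boolean_links', interactive: false },
})
console.log(example.sql)
console.log(example.bindings)
```

Keeping the name predicates inside each ON clause is what lets a missing property come back as null instead of removing the whole row.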