I am writing a SQL query in which the key (column) name comes from the user. When I pass that name as a parameter in a parameterized query from Python, the key in the result comes back as $1, $2, and so on. I want the actual column name instead.

I tried this with the Python SDK for Cosmos DB for NoSQL:

queryText = "SELECT @id, c.source_name, c.field_name FROM c"
item_list = list(container.query_items(query=queryText, parameters=[{"name": "@id", "value": "id_name"}], enable_cross_partition_query=True))

The output I get is:

[{'$1': 'id_value',
  'source_name': 'source_name_value',
  'field_name': 'field_name_value'},
.......
]

I want '$1' to be replaced by the string "id_name". Renaming the keys in Python afterwards is not practical: the code above is only an illustration, the real query has many columns, and it returns a large amount of data, so post-processing in the script would take a lot of compute. Building the SQL query by string concatenation, on the other hand, exposes my application to SQL injection.

  • Can't you use this as the SQL query? SELECT @id AS id_name, c.source_name ... Commented Sep 1, 2024 at 10:02
  • I want to avoid SQL injection, which is why I would rather not do it this way. Commented Sep 1, 2024 at 10:23

1 Answer

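This is not possible using parameters. A parameter is substituted as a value, not as a property name, so SELECT @id projects the string itself under an auto-generated key such as $1. If you want dynamic projection for your query, you need to build the SQL string with the properties to project in the text itself before sending it to the service.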
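One way to build that string without opening the door to injection is to accept only property names from a fixed allowlist. The following is a minimal sketch, not a definitive implementation: `ALLOWED_PROPERTIES` and `build_projection_query` are hypothetical names, and the property names are taken from the question.

```python
import re

# Hypothetical allowlist: the only properties a user may project.
ALLOWED_PROPERTIES = {"id_name", "source_name", "field_name"}

# Extra guard: plain identifiers only, so nothing else can reach the SQL text.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_projection_query(requested):
    """Return a SELECT statement that projects only allowlisted properties."""
    if not requested:
        raise ValueError("at least one property is required")
    for name in requested:
        if name not in ALLOWED_PROPERTIES or not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"property not allowed: {name!r}")
    projection = ", ".join(f"c.{name}" for name in requested)
    return f"SELECT {projection} FROM c"


query_text = build_projection_query(["id_name", "source_name", "field_name"])
# query_text == "SELECT c.id_name, c.source_name, c.field_name FROM c"
```

The resulting string can be passed as the query argument of container.query_items, as in the question. Values used in filters should still go through parameters; only the property names are placed in the text.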

Cosmos SQL syntax is read-only, which limits the impact of SQL injection: it is impossible to insert, update, or delete data, containers, or databases using SQL in Cosmos.

If you are trying to build a free-form query feature that uses aliases for the actual property names in your documents, you will need some form of schema management: a key-value lookup from alias to property name that you use to construct the query string before sending it to the service to execute. You can store this mapping in a Cosmos container, fetch it with a point-read, and keep it in memory for fast lookups at run-time.
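As a rough sketch of that lookup, assuming the alias map lives in a single document whose id and partition key are both "query-schema" and which has an "aliases" property (all hypothetical choices). The only SDK call used is read_item on a container; the function names are made up for illustration.

```python
# Hypothetical id (and partition key value) of the document holding the alias map.
SCHEMA_DOC_ID = "query-schema"

_alias_cache = None


def load_alias_map(container):
    """Point-read the schema document once and keep the map in memory."""
    global _alias_cache
    if _alias_cache is None:
        doc = container.read_item(item=SCHEMA_DOC_ID, partition_key=SCHEMA_DOC_ID)
        # Assumed shape: {"aliases": {"<alias>": "<property name>", ...}}
        _alias_cache = dict(doc["aliases"])
    return _alias_cache


def build_aliased_query(alias_map, requested_aliases):
    """Build the SELECT text from aliases that exist in the stored map."""
    parts = []
    for alias in requested_aliases:
        if alias not in alias_map:
            raise ValueError(f"unknown alias: {alias!r}")
        # Both names come from the stored map, never directly from user input.
        parts.append(f"c.{alias_map[alias]} AS {alias}")
    return "SELECT " + ", ".join(parts) + " FROM c"
```

User input is only ever used as a dictionary key here, so anything that is not in the map is rejected before any SQL text is built.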

One area of caution: this type of query capability for users can often have performance issues if it is not designed correctly. It is important to avoid high volumes of cross-partition queries, especially for containers with a large amount of data. A frequent solution is to have multiple containers and route each user query to the container whose partition key lets the query run in-partition rather than cross-partition. Another technique is to have one or more look-up containers holding small subsets of the data that provide the partition key and id value (or filter predicate values for a query) for the data being searched. There are many other techniques you can apply here as well.
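The look-up container technique could look roughly like the sketch below. It assumes a look-up document keyed by some user-facing value that stores the id and partition key of the target data under the hypothetical properties "target_id" and "target_partition_key"; the function name is also made up. It uses read_item for the point-read and query_items with its partition_key argument so the query runs in a single partition.

```python
def query_in_partition(lookup_container, data_container, lookup_key):
    """Resolve the partition key with a point-read, then query in-partition."""
    # Assumption: the look-up document's id and partition key are both lookup_key.
    pointer = lookup_container.read_item(item=lookup_key, partition_key=lookup_key)

    # Passing partition_key scopes the query to one partition,
    # so enable_cross_partition_query is not needed.
    items = data_container.query_items(
        query="SELECT c.source_name, c.field_name FROM c WHERE c.id = @id",
        parameters=[{"name": "@id", "value": pointer["target_id"]}],
        partition_key=pointer["target_partition_key"],
    )
    return list(items)
```

When the look-up document already gives both the id and the partition key of a single item, a direct point-read on the data container is an alternative to the query.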

The key is that if you want this to be scalable and efficient you need to do the following:

  1. Measure how expensive a single look-up query is that runs cross-partition.
  2. Multiply that by the expected number of concurrent users/queries to calculate the RU/s needed to run the user lookup feature.
  3. Explore ways to serve these queries in-partition using this or other techniques. Measure the performance and cost of those as well, then compare which is faster and cheaper.
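The arithmetic in these steps can be sketched as follows. The RU figures and query rate are made-up illustration values, not measurements, and `required_throughput` is a hypothetical helper. In practice the per-query charge is what the service reports in the x-ms-request-charge response header.

```python
import math


def required_throughput(ru_per_query, queries_per_second):
    """Steps 1 and 2: measured RU per query times the expected query rate."""
    return math.ceil(ru_per_query * queries_per_second)


# Hypothetical measurements for one look-up query (replace with real numbers).
cross_partition_ru = 45.0
in_partition_ru = 3.5
expected_qps = 200

cross_total = required_throughput(cross_partition_ru, expected_qps)  # 9000 RU/s
in_total = required_throughput(in_partition_ru, expected_qps)        # 700 RU/s

# Step 3: compare the two designs on cost.
cheaper = "in-partition" if in_total < cross_total else "cross-partition"
```

With these illustrative numbers the in-partition design needs far less provisioned throughput, but the comparison is only meaningful with charges measured on your own data.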

These steps apply no matter what you are doing in Cosmos. Cosmos is designed to be extremely fast, efficient and massively scalable, but you have to apply correct design principles to take advantage of it.
