I have a question about PostgreSQL. I tried using a cursor for the following but could not get it to work.

I have a large employee table that has multiple columns, one of them being the company name, see below:

master table

The master table is not sorted by date, so it needs to be sorted during the filtering process. This table can have more than 100,000 records. I want to create a second table that contains only some of the records for each company name: essentially, take a pre-defined number of rows for each company and then union the results together into a new table. My pre-defined records table would look something like this:

required records

My original master table could have more than 1,000 companies, and I might need only 50 of them, with the number of records for each of those 50 defined in my records table. How can this be done in PostgreSQL?

Sample output: each company gets the number of records specified in the pre-defined records table, sorted by date.

output

  • It would help if you provided a sample of the output you want your query to return based on these two tables. Commented Jul 12, 2021 at 22:05
  • @mukund thank you for the suggestion - I added the sample output. Please let me know if you know how this could work. I've tried using cursor but could not get it to work Commented Jul 12, 2021 at 22:16
  • Are you not simply filtering your master table based on the entries in the predefined records table? select * from employee_table where company_name in (select company_name from pre_defined_records_table) Commented Jul 12, 2021 at 22:21
  • And if you also want the number of records, then simply join them? select * from employee_table t1 left join pre_defined_records_table t2 on t1.company_name = t2.company_name where t1.company_name in (select company_name from pre_defined_records_table) Commented Jul 12, 2021 at 22:24
  • No, it's based on the number of required records i.e. the select query should only return x number of records which are specified in the 2nd table. So Apple has 3 records and Amazon has 2 in the output as compared to 6 and 5 records I would get if I just did what you said in your comment. Commented Jul 12, 2021 at 22:25

1 Answer

You can left join the master table to the predefined number-of-records table and filter it to the companies listed there. Then, in a subquery, compute a rank within each company ordered by date, and filter on that rank in the outer query with a WHERE condition.

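select t3.*
from (
    select
        t1.*
        , t2.number_of_records
        -- rank rows within each company by date, earliest first
        , RANK() OVER (PARTITION BY t1.company_name ORDER BY t1.date_column asc) as ranks
    from employee_table t1
    left join pre_defined_records_table t2
        on t1.company_name = t2.company_name
    -- keep only the companies listed in the predefined records table
    where t1.company_name in (
        select company_name
        from pre_defined_records_table
    )
) t3
-- keep at most the required number of rows per company
where t3.ranks <= t3.number_of_records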
I have used placeholder names for your columns and tables, as the sample data is not reproducible. PS: also check the DENSE_RANK() function.
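
One caveat worth noting: RANK() gives rows with the same date the same rank, so a tie at the cutoff can return more rows than requested. A minimal sketch of a variant that uses ROW_NUMBER() with a tiebreaker and writes the result into a new table, as the question asks, might look like the following. The table and column names (employee_table, pre_defined_records_table, employee_name, date_column, filtered_employees) and the sample rows are hypothetical, made up only so the sketch runs on its own; they are not taken from the original data.

```sql
-- Hypothetical schema; adjust names and types to the real tables.
CREATE TEMP TABLE employee_table (
    company_name  text,
    employee_name text,
    date_column   date
);

CREATE TEMP TABLE pre_defined_records_table (
    company_name      text,
    number_of_records integer
);

-- Made-up rows, only so the sketch can be run as-is.
INSERT INTO employee_table VALUES
    ('Apple',  'a1', '2021-01-01'),
    ('Apple',  'a2', '2021-01-02'),
    ('Apple',  'a3', '2021-01-03'),
    ('Apple',  'a4', '2021-01-03'),  -- same date as a3: a tie at the cutoff
    ('Amazon', 'b1', '2021-02-02'),
    ('Amazon', 'b2', '2021-02-01'),
    ('Amazon', 'b3', '2021-02-03'),
    ('Google', 'c1', '2021-03-01');  -- not in the records table, so excluded

INSERT INTO pre_defined_records_table VALUES
    ('Apple', 3),
    ('Amazon', 2);

-- Build the second table: at most number_of_records rows per company.
CREATE TEMP TABLE filtered_employees AS
SELECT company_name, employee_name, date_column
FROM (
    SELECT
        e.*,
        r.number_of_records,
        -- ROW_NUMBER() never repeats a number within a partition;
        -- employee_name is an assumed tiebreaker for equal dates.
        ROW_NUMBER() OVER (
            PARTITION BY e.company_name
            ORDER BY e.date_column ASC, e.employee_name ASC
        ) AS rn
    FROM employee_table e
    -- An inner join already drops companies missing from the records table.
    JOIN pre_defined_records_table r
        ON e.company_name = r.company_name
) ranked
WHERE rn <= number_of_records;

SELECT * FROM filtered_employees ORDER BY company_name, date_column;
```

With these sample rows the new table holds three Apple rows and two Amazon rows, whereas RANK() would keep four Apple rows because a3 and a4 share rank 3. The tables are created as TEMP here so the sketch leaves nothing behind; drop the TEMP keyword on filtered_employees to keep the result.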

1 Comment

Thank you @mukund. This is largely there; I will need to tweak it a little, but it shows me exactly how to go about it. I was using a cursor, which made it overly complicated in PostgreSQL; it would have been easy with a cursor in Microsoft SQL. Thank you!
