0

I'm posting it here because I couldn't' find any such scenario on the web so far. I have a webpage which contains a set of reports both in XLS and PDF formats. I should be downloading the excel files from the page and load into my database. I wish I could use the URL for XLS file directly but the problem is the naming convention may keep changing every time (Sales_Quarter1.xlsx can be Sales_Q1.xlsx the next year). The only thing that would be constant in the following example is "Sales for Calendar Year". I should be looking up for the file that corresponds to this text and download it before loading it into database table.

I would like to know from experts if this would be possible?

<li>
   <sub>Sales for Calendar Year 2015--All Countries&#160;</sub> 
   <a href="/Data/Downloads/Documents/Sales/Sales_Quarter1.xlsx"> 
   <sub>[XLS]</sub></a><sub>&#160;, <a href="/Data/Downloads/Documents/Sales/Sales_Quarter1.pdf"><sub>[PDF]</sub></a><sub>​</sub></sub>
</li>

PS: I am using SQL Server 2014.

Thanks!

4
  • Yes it would be possible. However, you didn't provide any actual details about...well...anything so there really isn't much anybody can do to offer any help. Commented Oct 28, 2015 at 13:48
  • Hello Sean, Thanks for replying. Please let me know what else you would like to see so that I can provide all those details. This is my first post here so was not sure. Commented Oct 28, 2015 at 14:15
  • Here is a great place to start. spaghettidba.com/2015/04/24/… Commented Oct 28, 2015 at 14:19
  • Thanks Sean. The reason I didn't post any DDL/DML is that I am only looking for how to download the file from web using SSIS based on the condition I've mentioned. Once I get the way to download the file, I can drive by myself. Commented Oct 28, 2015 at 15:56

1 Answer 1

0

Have a look at Integration Services. Create a package for both pulling the web page using a script task, along with a variable name that will represent your downloaded, local filenames for the html file and excel files (you will also have to parse the link out of the html file). Then utilize an Excel Source next in your package.

The variable name for the excel file used in the script task will need to be set to ReadWrite as well.

You can also schedule the resulting package execution via SQL Agent job, if you plan to run this on a reoccurring basis, placing logic into the script or the execution paths,

Sign up to request clarification or add additional context in comments.

3 Comments

Thanks Andrew for the script task resource. I tried implementing the same and getting the following error. Error: Failed to lock variable "User::RemoteUri" for read access with error 0xC0010001 "The variable cannot be found. This occurs when an attempt is made to retrieve a variable from the Variables collection on a container during execution of the package, and the variable is not there. The variable name may have changed or the variable is not being created.".
You need to set the variable usage in the script task to readwrite
I did set the variable usage to readwrite mode in the Script Task and when I execute it, it executes forever.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.