I'm trying to parse XLS files from Google Docs with PHP. It works fine when I manually download a file and then upload it to the server, but when I use PHP to save the exact same XLS file to the server directly, instead of getting all the data in the XLS, the response is:

<b>DOM ELEMENT: </b>HTML<br /><b>ATTRIBUTE: </b>lang => en<br /><b>DOM ELEMENT: </b>HEAD<br /><b>DOM ELEMENT: </b>META<br /><b>ATTRIBUTE: </b>charset => utf-8<br /><b>DOM ELEMENT: </b>META<br /><b>ATTRIBUTE: </b>content => width=300, initial-scale=1<br /><b>ATTRIBUTE: </b>name => viewport<br /><b>DOM ELEMENT: </b>META<br /><b>ATTRIBUTE: </b>name => description<br /><b>ATTRIBUTE: </b>content => Create a new spreadsheet and edit with others at the same time -- from your computer, phone or tablet. Get stuff done with or without an internet connection. Use Sheets to edit Excel files. Free from Google.<br /><b>DOM ELEMENT: </b>TITLE<br />

Here's an example of how I use PHP to save the XLS to the server:

$fileName = 'xls/newday2014.xls';
$xlsURL = 'https://docs.google.com/spreadsheets/d/1KKMiBOlvpKaAJ_MsNfaWGmR6ixL53AjAaLf0R18X3e4/edit#gid=161299136';
file_put_contents($fileName, file_get_contents($xlsURL));

2 Answers

You're missing some fundamental things here with your three-line code:

  • file_get_contents is not a browser. The URI (URL) you pass to it must not rely on the fragment (in your case #gid=161299136), because the fragment is never sent to the server.
  • The previous point also implies: if you used that exact URI to download the file with your browser, there is most likely some script running in the browser that builds the correct download URI first. So you are using the wrong URI to download.
  • file_get_contents does not magically log you in to a Google account.
  • Giving a file a name ending in .xls does not magically change its format from HTML to an Excel spreadsheet.
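To see the first and last points for yourself, here is a minimal sketch. The helper names withoutFragment and sniffFormat are made up for illustration and do not come from any library. The first shows what part of the URL the server actually receives; the second guesses the real format of the downloaded bytes from their leading signature instead of trusting the file name.

```php
<?php
// Hypothetical helpers for illustration only.

// Return the part of a URL that is actually sent to the server,
// i.e. everything before the first '#'.
function withoutFragment(string $url): string
{
    $pos = strpos($url, '#');
    return $pos === false ? $url : substr($url, 0, $pos);
}

// Guess the real format of downloaded bytes from their leading signature.
function sniffFormat(string $bytes): string
{
    // OLE2 compound file signature, used by legacy .xls workbooks.
    if (strncmp($bytes, "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", 8) === 0) {
        return 'xls';
    }
    // ZIP local file header, the container used by .xlsx workbooks.
    if (strncmp($bytes, "PK\x03\x04", 4) === 0) {
        return 'zip';
    }
    // An HTML page, e.g. a login or landing page.
    if (preg_match('/^\s*<(!doctype|html)/i', $bytes) === 1) {
        return 'html';
    }
    return 'unknown';
}
```

Running sniffFormat(file_get_contents($fileName)) on the file saved by the code in the question should tell you whether you stored a spreadsheet or just a web page.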

As these are already four fundamental problems with your three lines of code, it should be obvious that the code is highly unfit for what you are trying to do. I suggest you throw it away and start from scratch, doing some research first, e.g. contact the vendor of that web service about your support options as a PHP developer. Most likely they offer an API for what you're trying to do.
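As a starting point for that research, here is a rough sketch of turning the "edit" URL into a download URL. It rests on an assumption you need to verify yourself: that the spreadsheet is shared so that anyone with the link can view it, and that Google serves an export at /export?format=...&gid=... for such documents. The function name buildExportUrl is hypothetical.

```php
<?php
// Hypothetical helper: derive an export URL from a Google Sheets "edit" URL.
// ASSUMPTION: the /export?format=...&gid=... endpoint and its parameters;
// check Google's current documentation before relying on it.
function buildExportUrl(string $editUrl, string $format = 'xlsx'): ?string
{
    // Extract the document id that follows /spreadsheets/d/.
    if (preg_match('~/spreadsheets/d/([A-Za-z0-9_-]+)~', $editUrl, $m) !== 1) {
        return null; // not a recognizable spreadsheet URL
    }

    $query = ['format' => $format];

    // The sheet id sits in the fragment, which the server never sees,
    // so it has to be moved into the query string.
    if (preg_match('~[#&?]gid=(\d+)~', $editUrl, $g) === 1) {
        $query['gid'] = $g[1];
    }

    return 'https://docs.google.com/spreadsheets/d/' . $m[1]
        . '/export?' . http_build_query($query);
}
```

If the document is private, this will still hand you an HTML login page, and you are back to needing an authenticated API.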


Most likely, a cookie is set in your browser when you log in to Google Docs. This cookie is not present on the file_get_contents($xlsURL) call, so you get different content. The web debugger of your choice will confirm that, as will pasting your URL into a browser that is not logged in.

The cURL extension can hand cookies to the server, but please understand that this cookie is dynamic, so copying it out of your browser and into cURL is far from enough. Most likely you will have to walk the complete way from login to the document, and you will need to update your code whenever Google chooses to change that flow.
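For reference, this is roughly what cookie handling with cURL looks like. It is only a sketch: the function name fetchWithCookieJar and the cookie file path are made up, and it does not perform any login. It merely keeps whatever cookies the server sets between requests, which is the building block you would need for the login walk described above.

```php
<?php
// Hypothetical helper; requires the cURL extension.
function fetchWithCookieJar(string $url, string $cookieFile): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects
        CURLOPT_COOKIEFILE     => $cookieFile, // send cookies saved by earlier requests
        CURLOPT_COOKIEJAR      => $cookieFile, // save received cookies when the handle is closed
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('Download failed: ' . curl_error($ch));
    }

    return $body;
}
```

Even with this in place, check what you actually received before saving it under an .xls name.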

1 Comment

Note: Google probably has an API for this, but I don't know it off the top of my head.
