I'm subscribed to a data service that sends out 2 POST requests every minute.

The content type is text/xml, as the request headers show:

{'Host': 'my.host.com', 'X-Real-Ip': '111.1.181.11', 'X-Forwarded-For': '111.1.181.11', 'X-Forwarded-Proto': 'https', 'Connection': 'close', 'Content-Length': '8556', 'Content-Type': 'text/xml', 'Accept-Encoding': 'identity,gzip', 'Content-Encoding': 'gzip', 'Soapaction': '"http://datex2.eu/wsdl/supplierPush/2_0/putDatex2Data"', 'User-Agent': 'Jakarta Commons-HttpClient/3.0'}

I save the request to a file using the following function, so that I can inspect its contents:

import os

from django.http import HttpResponse
from django.shortcuts import redirect


def receive_post_request_and_save_it_to_file(request):

    if request.method == "GET":
        return redirect("/")

    elif request.method == "POST":
        if request.content_type == "text/xml" and request.body:
            try:
                filename = "my-request.xml"
                file_path = "request-files/"
                # Create the target directory on first use.
                os.makedirs(file_path, exist_ok=True)
                # request.body holds the raw bytes exactly as they were sent.
                with open(os.path.join(file_path, filename), 'wb') as f:
                    f.write(request.body)
                return HttpResponse(status=200)

            except Exception as e:
                print("Something went wrong => {}".format(e))
                return HttpResponse(status=500)

        # Reject POSTs that are empty or not text/xml.
        return HttpResponse(status=400)

    else:
        return HttpResponse(status=404)

Instead of the XML I used to receive from other providers, I get binary data that I cannot open:

��]�n�8�}�WyTLRԅ��T��Q@.�$�:g^����%C��J�̼�g�[��ٺ9�-鷺ɞmL�S1o"�6�E�\;\f�����Aϲ�|�����2���(o���x�������,��u�ix<
�R�g��(��yv����E�Y]⺰8*�7q�WM�C1�ǎ��~[&딏��'�N�4�����M��M�.�(����|��h���2��w��Z&o��ϳqTT���mC㻻���^�e�$�o9��Ƣ~���!;��:[Iz�mC����4J�5`�Z,f��?�:-�IR?|��UZ�O�X����0�"�Y�X�߼��77އÞ�p����s��,��/��Y]�Q<-��㛤XUa����r�;�54e�~���H0��0���B���[����8�N��{���rs
�+{�6f���4�*Wr�$|�x���?�$m�d�mڦ:m�MNu��$����N����,)�Fiv��E����ɛ�a������r������p��u���8��Λ�E�:�f��x���.�4��sx�q���������z�m������
yv�%%�W���x��F��X�E�����}\����@�I����ӯMx�����í��m�N��J�g�2/^j�N�6��4��o=y۔�u�����h��C����:�2�I�A/�y���C�^6�v��+��C�,�����M��1�W������|�%�[̾���~\��RWY�͒tz<J��t<��tPd�d/�����4O�9���(��E
��

The XML file that I should be receiving looks like this:

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><soap:Body><d2LogicalModel xmlns="http://datex2.eu/schema/2/2_0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" modelBaseVersion="2" xsi:schemaLocation="http://datex2.eu/schema/2/2_0 http://sdbby.heuboe.de:11447/d2Schema/StrategicRouting.xsd"><exchange><supplierIdentification><country>de</country><nationalIdentifier>DE-BY-SDB</nationalIdentifier></supplierIdentification></exchange><payloadPublication xsi:type="SituationPublication" lang="de"><publicationTime>2020-07-22T16:21:00+02:00</publicationTime><publicationCreator><country>de</country><nationalIdentifier>DE-BY-SDB</nationalIdentifier></publicationCreator><situation id="S1595414940876" version="1"><situationVersionTime>2020-07-22T12:49:00+02:00</situationVersionTime><headerInformation><confidentiality>noRestriction</confidentiality><informationStatus>real</informationStatus></headerInformation><situationRecord xsi:type="GeneralNetworkManagement" id="R1595414940876" version="1"><situationRecordCreationTime>2020-07-22T12:49:00+02:00</situationRecordCreationTime><situationRecordVersionTime>2020-07-22T12:49:00+02:00</situationRecordVersionTime><probabilityOfOccurrence>certain</probabilityOfOccurrence><validity><validityStatus>definedByValidityTimeSpec</validityStatus><validityTimeSpecification><overallStartTime>2020-07-22T12:49:00+02:00</overallStartTime><overallEndTime>2020-07-22T16:23:00+02:00</overallEndTime></validityTimeSpecification></validity><cause xsi:type="NonManagedCause"><causeDescription><values><value lang="de">Optimierte Zielführung zu den Parkplätzen Messe</value><value lang="en">improved routing to parking area Munich Trade Fair</value></values></causeDescription><causeType>other</causeType></cause><groupOfLocations 
xsi:type="Area"><alertCArea><alertCLocationCountryCode>D</alertCLocationCountryCode><alertCLocationTableNumber>1</alertCLocationTableNumber><alertCLocationTableVersion>15.1</alertCLocationTableVersion><areaLocation><alertCLocationName><values><value lang="de">Großraum München</value></values></alertCLocationName><specificLocation>548</specificLocation></areaLocation></alertCArea></groupOfLocations><actionPlanIdentifier>S-14c</actionPlanIdentifier><operatorActionStatus>implemented</operatorActionStatus><complianceOption>advisory</complianceOption><generalNetworkManagementType>other</generalNetworkManagementType><generalNetworkManagementExtension><generalNetworkManagementExtended xsi:type="StrategicRouteManagement"><nameOfRouteManagement><values><value lang="de">A94 West - Riem 2 Parkhaus West</value>
                            </values>
                        </nameOfRouteManagement><triggerOrigin><location xsi:type="Point"><pointByCoordinates><pointCoordinates><latitude>48.137062</latitude><longitude>11.617391</longitude>
    </pointCoordinates>
</pointByCoordinates><pointExtension><openlrExtendedPoint><openlrPointLocationReference><openlrPointAlongLine><openlrSideOfRoad>right</openlrSideOfRoad><openlrOrientation>withLineDirection</openlrOrientation><openlrPositiveOffset>346</openlrPositiveOffset><openlrLocationReferencePoint><openlrCoordinate><latitude>48.13819</latitude><longitude>11.6146</longitude>
                    </openlrCoordinate><openlrLineAttributes><openlrFunctionalRoadClass>FRC2</openlrFunctionalRoadClass><openlrFormOfWay>singleCarriageway</openlrFormOfWay><openlrBearing>182</openlrBearing>
                    </openlrLineAttributes><openlrPathAttributes><openlrLowestFRCToNextLRPoint>FRC2</openlrLowestFRCToNextLRPoint><openlrDistanceToNextLRPoint>738</openlrDistanceToNextLRPoint>
                    </openlrPathAttributes>
                </openlrLocationReferencePoint><openlrLastLocationReferencePoint><openlrCoordinate><latitude>48.13779</latitude><longitude>11.62253</longitude>
                    </openlrCoordinate><openlrLineAttributes><openlrFunctionalRoadClass>FRC2</openlrFunctionalRoadClass><openlrFormOfWay>multipleCarriageway</openlrFormOfWay><openlrBearing>241</openlrBearing>
                    </openlrLineAttributes>
                </openlrLastLocationReferencePoint>
            </openlrPointAlongLine>
        </openlrPointLocationReference>
    </openlrExtendedPoint>
</pointExtension>
                            </location>
                        </triggerOrigin><route>

Instead of receiving an XML request, I receive a binary one.

Q:

Is there a different way to parse this data?

Did the data provider send the wrong data?

I hope that someone can guide me in the right direction.

Thanks in advance!

1 Answer

If you look at the headers of the incoming request, you will see:

Content-Encoding: gzip

This means the body is compressed with gzip (see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Encoding).

In a normal web browser you would never see this, because the browser decodes the content automatically.

Python's standard library has a gzip module (built on top of zlib) with functions for decoding gzip-encoded data (see https://docs.python.org/2/library/gzip.html).
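As a rough illustration of the decoding step, here is a minimal sketch assuming Python 3 (gzip.decompress is not available in Python 2). The helper name decode_body and the stand-in payload are made up for this example; they are not part of the question's code.

```python
import gzip


def decode_body(body: bytes, content_encoding: str = "") -> bytes:
    """Return the payload, gunzipping it when the header says gzip.

    decode_body is a hypothetical helper name used only in this sketch.
    """
    # Content-Encoding can list several codings; only plain "gzip" is handled here.
    if content_encoding.strip().lower() == "gzip":
        return gzip.decompress(body)
    return body


# Round trip with a stand-in payload (not the provider's real data).
sample = b"<soap:Envelope>...</soap:Envelope>"
compressed = gzip.compress(sample)
assert decode_body(compressed, "gzip") == sample
assert decode_body(sample) == sample
```

In a Django view, the two arguments would presumably come from request.body and request.META.get("HTTP_CONTENT_ENCODING", ""), so the result could be written to the file in place of the raw body.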

Once you've decoded the data you should be able to process it like normal XML.
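To show what "process it like normal XML" might look like, here is a small sketch using the standard library's xml.etree.ElementTree. The function name publication_time and the shortened sample message are assumptions made for the example; the namespaces are copied from the XML in the question.

```python
import gzip
import xml.etree.ElementTree as ET

# Namespaces as they appear in the sample document from the question.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "d2": "http://datex2.eu/schema/2/2_0",
}


def publication_time(xml_bytes: bytes):
    """Return the text of the first <publicationTime> element, or None."""
    root = ET.fromstring(xml_bytes)
    node = root.find(".//d2:publicationTime", NS)
    return node.text if node is not None else None


# Shortened stand-in for the real message, gzipped to mimic the request body.
sample = (
    b'<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    b'<soap:Body><d2LogicalModel xmlns="http://datex2.eu/schema/2/2_0">'
    b"<payloadPublication><publicationTime>2020-07-22T16:21:00+02:00"
    b"</publicationTime></payloadPublication>"
    b"</d2LogicalModel></soap:Body></soap:Envelope>"
)
body = gzip.compress(sample)

print(publication_time(gzip.decompress(body)))  # prints 2020-07-22T16:21:00+02:00
```

The same lookup pattern should work for other elements of the DATEX II payload, as long as the d2 prefix is used for every element in the default namespace.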

1 Comment

Yes, the gzip library decoded the contents! Using gzip.decompress(data) was exactly what I needed. Thank you!