For a project I need to develop an ETL (extract, transform, load) process that reads data from a legacy tool that exposes its data through a REST API. This data needs to be stored in Amazon S3.

I would really like to try this with Apache NiFi, but I honestly have no clue yet how to connect to the REST API, or where and how to implement the business logic needed to 'talk the right protocol' with the source system. For example, I would like to keep track of what data has been written so far, so that loading can resume where it left off.

So far I have been reading the NiFi documentation, and I'm getting better insight into what the tool provides. However, it's not clear to me how I could implement this task within the NiFi architecture.

Hopefully someone can give me some guidance?

Thanks, Paul

1 Answer

The InvokeHTTP processor can be used to query a REST API.

Here is a simple flow that:

  1. Queries the REST API at https://api.exchangeratesapi.io/latest every 10 minutes
  2. Sets the output-file name (exchangerates_<ID>.json)
  3. Stores the query response in the output file on the local filesystem (under /tmp/data-out)

I exported the flow as a NiFi template and stored it in a gist. The template can be imported into a NiFi instance and run as is.
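
For readers who want to see the logic of the three steps spelled out, here is a minimal Python sketch of what the flow does. It is not NiFi configuration and not the contents of the template. The function names are hypothetical, a random UUID stands in for `<ID>` (the answer does not say what the ID is), and it assumes the endpoint answers a plain unauthenticated GET.

```python
import time
import uuid
from pathlib import Path
from urllib.request import urlopen

API_URL = "https://api.exchangeratesapi.io/latest"  # endpoint named in the answer
OUT_DIR = "/tmp/data-out"                           # output directory named in the answer
POLL_SECONDS = 10 * 60                              # "every 10 minutes"


def output_filename(file_id: str) -> str:
    """Step 2: build the output-file name exchangerates_<ID>.json."""
    return f"exchangerates_{file_id}.json"


def store_response(body: bytes, out_dir: str, file_id: str) -> Path:
    """Step 3: write the response body to a file under out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    path = target / output_filename(file_id)
    path.write_bytes(body)
    return path


def fetch(url: str = API_URL, timeout: float = 30.0) -> bytes:
    """Step 1: query the REST API (assumes an unauthenticated GET works)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def run_forever() -> None:
    """Repeat the three steps on a fixed schedule."""
    while True:
        # Assumption: a random UUID plays the role of <ID>.
        store_response(fetch(), OUT_DIR, uuid.uuid4().hex)
        time.sleep(POLL_SECONDS)
```

Calling `run_forever()` starts the polling loop. In NiFi the same responsibilities are split across processors and the scheduling is configured on the processor rather than written as a loop.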

3 Comments

How can you handle pagination when the REST API called from NiFi returns a "next" link for fetching the next page of data?
What about strong authentication, tokens, etc.?
@thebluephantom The processor supports multiple authentication methods. See the docs reference.
