
Please help me decide on a strategy for an Extract, Transform, Load (ETL) workflow using AWS offerings. I am new to the AWS cloud. My use case is to read thousands of record rows from a CSV file and write them out as a Microsoft Excel document, stored as a single object in an S3 bucket. Currently I am doing a proof of concept with AWS Lambda. My issue is that /tmp usage exceeds the 512 MB limit even when I select 6 GB of RAM in the Lambda configuration. I do not do any disk operations: I read the entire CSV content from S3 into RAM at once and write the Excel output in RAM.
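For reference, a minimal sketch of the fully in-memory approach the question describes. The bucket and key names are hypothetical placeholders, and pandas with openpyxl is only an assumed library choice, since the question does not name its CSV/Excel libraries:

```python
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")

def handler(event, context):
    # Read the entire CSV object from S3 into RAM -- no /tmp involved.
    obj = s3.get_object(Bucket="my-bucket", Key="input/records.csv")  # hypothetical names
    df = pd.read_csv(io.BytesIO(obj["Body"].read()))

    # Write the Excel workbook into an in-memory buffer.
    buf = io.BytesIO()
    df.to_excel(buf, index=False, engine="openpyxl")
    buf.seek(0)

    # Store the workbook as a single S3 object.
    s3.put_object(Bucket="my-bucket", Key="output/records.xlsx", Body=buf.getvalue())
```

Nothing in this handler itself touches /tmp; both the DataFrame and the output buffer live in the configured RAM.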

3 Comments
  • Can you please check container Lambdas? They have 10 GB of space. Commented Mar 15, 2021 at 8:29
  • Existing Lambdas have also offered 10 GB since Dec 2020. Commented Mar 16, 2021 at 14:12
  • I mean 10 GB of disk space. Existing Lambdas offer only 10 GB of RAM and 512 MB of disk space. Commented Mar 16, 2021 at 23:13

1 Answer


You should consider using AWS Glue for large-volume data transformations. Lambda may be ill-suited to the task.


2 Comments

We are running an AWS Glue workflow. I cannot export to Microsoft Excel from either AWS Glue or AWS Athena.
You can do it in the Python environment of Glue. Do not use PySpark. You will need to do the transformation using pandas; see the sketch below.
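A minimal sketch of what that comment suggests: a Glue Python shell script (not a Spark job) that transforms with plain pandas and writes the workbook back to S3. Bucket and key names are hypothetical, and openpyxl is assumed to be available in the job environment (it may need to be supplied as an additional library):

```python
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")

# Hypothetical bucket/key names.
src = s3.get_object(Bucket="etl-bucket", Key="raw/records.csv")
df = pd.read_csv(io.BytesIO(src["Body"].read()))

# Apply transformations with plain pandas (no PySpark), e.g. a trivial cleanup:
df = df.dropna(how="all")

# Serialize to .xlsx in memory and upload as a single object.
out = io.BytesIO()
df.to_excel(out, index=False, engine="openpyxl")
out.seek(0)
s3.put_object(Bucket="etl-bucket", Key="curated/records.xlsx", Body=out.getvalue())
```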
