Please help me decide on a strategy for an Extraction, Transformation and Loading (ETL) workflow using Amazon AWS offerings. I am new to the Amazon cloud. My use case is to read thousands of record rows from a CSV file and write them out to a Microsoft Excel document, which I then want to store as a single object in an S3 bucket. Currently I am doing a proof of concept with AWS Lambda. My issue is that /tmp usage exceeds the 512 MB limit even if I select 6 GB of RAM in the Lambda configuration, although I do not do any disk operations: I read the entire CSV content from S3 into RAM once and write out the Excel file in RAM.
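For reference, a minimal sketch of the in-memory pattern I mean (bucket and key names are placeholders; it assumes pandas and openpyxl are packaged with the function):

```python
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")

def handler(event, context):
    # Read the whole CSV from S3 into memory -- no /tmp usage.
    obj = s3.get_object(Bucket="my-input-bucket", Key="input/records.csv")
    df = pd.read_csv(io.BytesIO(obj["Body"].read()))

    # Write the Excel workbook to an in-memory buffer.
    buffer = io.BytesIO()
    df.to_excel(buffer, index=False, engine="openpyxl")

    # Store the Excel document as a single S3 object.
    s3.put_object(
        Bucket="my-output-bucket",
        Key="output/records.xlsx",
        Body=buffer.getvalue(),
    )
```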
- Can you please check container Lambdas? They have 10 GB of space. – Marcin, Mar 15, 2021 at 8:29
- Existing Lambdas (since Dec 2020) also offer 10 GB. – Prakashsinha Bayas, Mar 16, 2021 at 14:12
- I mean 10 GB of disk space. Existing Lambdas offer only 10 GB of RAM and 512 MB of disk space. – Marcin, Mar 16, 2021 at 23:13
1 Answer
You should consider using AWS Glue for large-volume data transformations. Lambda may be ill-suited to the task.
2 Comments
Prakashsinha Bayas
We are using an AWS Glue workflow. I cannot export to Microsoft Excel from either AWS Glue or AWS Athena.
Emerson
You can do it in the Python shell environment of Glue. Do not use PySpark; you will need to do the transformation using pandas.
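For example, a minimal sketch of a Glue Python shell job script for the export step (bucket names and keys are placeholders; it assumes pandas and openpyxl are available in the job environment):

```python
import io

import boto3
import pandas as pd

s3 = boto3.client("s3")

# Read the CSV produced by the Glue workflow from S3 into memory.
obj = s3.get_object(Bucket="my-data-bucket", Key="transformed/records.csv")
df = pd.read_csv(io.BytesIO(obj["Body"].read()))

# Any further pandas transformations go here.

# Write the result as an Excel workbook and store it as a single S3 object.
buffer = io.BytesIO()
df.to_excel(buffer, index=False, engine="openpyxl")
s3.put_object(
    Bucket="my-data-bucket",
    Key="exports/records.xlsx",
    Body=buffer.getvalue(),
)
```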