How to convert a CSV file from S3 to JSON format using Python?
In today's lesson, we will learn how to convert a CSV file from AWS S3 to JSON format using a Python script. After that, we will put the JSON file back into the S3 bucket.
1. Download and install the boto3 library
The csv, json and codecs modules are part of the Python standard library, so only boto3 needs to be installed.
$ pip install boto3
2. Convert the CSV file from S3 to JSON format
# import the required libraries (csv, json and codecs ship with Python)
import codecs
import csv
import json

import boto3

# declare the S3 variables
targetBucket = '<bucket-name>'
csvFile = 'file.csv'
jsonFile = 'file.json'

# connect to S3 using the boto3 client
s3_client = boto3.client(service_name='s3')

# get the S3 object and split its body into lines (still bytes at this point)
result = s3_client.get_object(Bucket=targetBucket, Key=csvFile)
csv_content = result['Body'].read().splitlines()

# use the CSV reader to read the object and decode the contents
read = csv.reader(codecs.iterdecode(csv_content, 'utf-8'))

# convert each CSV row to a dict; the file is expected to have three columns
line = []
for x in read:
    line.append({'test1': x[0], 'test2': x[1], 'test3': x[2]})

# put the JSON file back into the S3 bucket
# json.dumps handles quoting and escaping, so values containing quotes stay valid JSON
s3_client.put_object(
    Bucket=targetBucket,
    Body=json.dumps(line).encode('utf-8'),
    Key=jsonFile,
    ServerSideEncryption='AES256'
)
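The conversion step can also be tried locally without touching S3. The snippet below is a minimal sketch, not part of the original script: the helper name csv_bytes_to_json and its fieldnames parameter are made up for illustration, and it assumes the CSV is UTF-8 encoded. It takes the raw bytes you would get from result['Body'].read() and returns a JSON string, using csv.DictReader so that quoted fields are handled by the csv module.

```python
import csv
import io
import json


def csv_bytes_to_json(data, fieldnames=None):
    """Convert raw CSV bytes to a JSON array string.

    Hypothetical helper for illustration only. If fieldnames is None,
    the first CSV row is treated as the header row.
    """
    # Assumption: the CSV content is UTF-8 encoded.
    text = data.decode('utf-8')
    # newline='' leaves line-ending handling to the csv module,
    # which also copes with newlines inside quoted fields.
    reader = csv.DictReader(io.StringIO(text, newline=''), fieldnames=fieldnames)
    return json.dumps([dict(row) for row in reader])


# Example with the same three keys used in the script above.
sample = b'1,"say ""hi""",x\n2,b,y\n'
print(csv_bytes_to_json(sample, fieldnames=['test1', 'test2', 'test3']))
```

If your CSV file already has a header row, call the helper without fieldnames and the column names from the file become the JSON keys. The returned string can then be encoded and passed as Body to put_object.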
That's it. I hope this helps.
Photo: https://www.brandeps.com