How to Upload an Object into S3 with Multipart Upload Using the S3 API CLI


In this post, I will walk through how to upload an object with multipart upload using the "aws s3api" commands. Before I studied this, I thought AWS did the whole process for me: split, upload and reassemble. However, with these low-level commands it does not. (I am not sure whether the SDKs can do all of this.) I will cover every step, from splitting the file to completing the multipart upload.


1. Split object


Before uploading, I need to split the object into pieces of the size I want to upload. In my case, I chose about 40 MB. The "split" command splits a file by a specific size. "split -b 40000000 sample/TWICE「BDZ」Music\ Video.mp4 splited" means that the "sample/TWICE「BDZ」Music\ Video.mp4" file is split into pieces of 40,000,000 bytes and that the names of the pieces are prefixed with "splited". As a result, I get 2 files.


# split -b 40000000 sample/TWICE「BDZ」Music\ Video.mp4 splited


# ls -la spliteda*

-rw-r--r-- 1 root root 40000000 Sep 11 09:52 splitedaa

-rw-r--r-- 1 root root 21067922 Sep 11 09:52 splitedab
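Before uploading, it can be worth confirming that the pieces really reassemble into the original file. The following is a minimal sketch, not part of the original procedure: it assumes a throwaway file named sample.bin (hypothetical, generated here so that the example is self-contained) and a scaled-down piece size of 400,000 bytes.

```shell
# Work in a scratch directory so the example does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for the video: 1,000,000 random bytes.
head -c 1000000 /dev/urandom > sample.bin

# Same idea as above, scaled down: 400,000-byte pieces, prefix "splited".
split -b 400000 sample.bin splited

# split names the pieces splitedaa, splitedab, ... so the shell glob
# returns them in the right order for reassembly.
cat splited* > rebuilt.bin

# cmp exits with status 0 only when the two files are byte-identical.
if cmp -s sample.bin rebuilt.bin; then
    roundtrip=ok
else
    roundtrip=mismatch
fi
echo "pieces: $(ls splited* | wc -l), roundtrip: $roundtrip"
```

The small sizes are only for this local check. For a real upload, S3 requires every part except the last one to be at least 5 MB.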


I will upload these files with multipart-upload.


2. Create Multipart-Upload


There are several steps in a multipart upload. First, I create the multipart upload. This step tells S3 that I have a large object to upload.


# aws s3api create-multipart-upload --bucket s3-multistore-bucket --key 'TWICE「BDZ」Music Video.mp4'

{

    "Bucket": "s3-multistore-bucket",

    "UploadId": "j.kLRorRoj1WEDV9iH1jrIeee5KETBL3raUH5odcycSxsl0RZY4p9Q.WK04lL9c7tsUxmDXwGEHwVlgm_MR.La4IHkM1M5xNQrCXossn5L_nXKJ0v.9_B3mNbL8GSoE8WBToEMPfELYF3VPh3g5PHg--",

    "Key": "TWICE「BDZ」Music Video.mp4"

}


After running this command, I get an "UploadId". Please note this value, because it is used in the following steps.
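Copying the long UploadId by hand is error-prone. Here is a sketch of capturing it in a shell variable instead, using the CLI's global --query and --output options. The function name start_upload is my own invention; the bucket and key in the usage comment are the ones from this post.

```shell
# start_upload BUCKET KEY
# Starts a multipart upload and prints only the UploadId.
# --query takes a JMESPath expression; --output text drops the JSON wrapping.
start_upload() {
    aws s3api create-multipart-upload \
        --bucket "$1" \
        --key "$2" \
        --query UploadId \
        --output text
}

# Usage (needs valid AWS credentials and an existing bucket):
#   upload_id=$(start_upload s3-multistore-bucket 'TWICE「BDZ」Music Video.mp4')
#   echo "$upload_id"
```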


3. Upload-part


So far, no file has been uploaded. I need to upload the split files with the upload-part command. There are 2 options in this command that I need to focus on: "upload-id" and "part-number". "upload-id" identifies the multipart upload that the part belongs to. "part-number" is the sequential number used to assemble the parts at the end of the process.


# aws s3api upload-part --bucket s3-multistore-bucket --key 'TWICE「BDZ」Music Video.mp4' --part-number 1 --body splitedaa --upload-id  "j.kLRorRoj1WEDV9iH1jrIeee5KETBL3raUH5odcycSxsl0RZY4p9Q.WK04lL9c7tsUxmDXwGEHwVlgm_MR.La4IHkM1M5xNQrCXossn5L_nXKJ0v.9_B3mNbL8GSoE8WBToEMPfELYF3VPh3g5PHg--"

{

    "ETag": "\"45de77ba3c71c3105a358971b7464b5a\""

}

# aws s3api upload-part --bucket s3-multistore-bucket --key 'TWICE「BDZ」Music Video.mp4' --part-number 2 --body splitedab --upload-id  "j.kLRorRoj1WEDV9iH1jrIeee5KETBL3raUH5odcycSxsl0RZY4p9Q.WK04lL9c7tsUxmDXwGEHwVlgm_MR.La4IHkM1M5xNQrCXossn5L_nXKJ0v.9_B3mNbL8GSoE8WBToEMPfELYF3VPh3g5PHg--"

{

    "ETag": "\"7fa82dacd6f59e523e60058660e1ab6f\""

}


I have uploaded 2 files into S3. However, I cannot see the object in the S3 GUI, because it is not a complete object yet. So I need to finish the process.
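With more than a couple of pieces, typing one upload-part command per file gets tedious. The loop below is a rough sketch of automating it. The function name upload_parts and its argument order are my own assumptions, and it relies on the shell glob returning the pieces in the order that split created them.

```shell
# upload_parts BUCKET KEY UPLOAD_ID PREFIX
# Uploads every file matching PREFIX* as consecutive parts (1, 2, ...)
# and prints one "part-number ETag" line per part.
upload_parts() {
    bucket=$1
    key=$2
    upload_id=$3
    prefix=$4
    n=0
    for part in "$prefix"*; do
        n=$((n + 1))
        # --query ETag --output text prints just the ETag value,
        # which itself includes the surrounding double quotes.
        etag=$(aws s3api upload-part \
            --bucket "$bucket" --key "$key" \
            --upload-id "$upload_id" \
            --part-number "$n" --body "$part" \
            --query ETag --output text) || return 1
        echo "$n $etag"
    done
}

# Usage (needs valid AWS credentials and an UploadId from step 2):
#   upload_parts s3-multistore-bucket 'TWICE「BDZ」Music Video.mp4' "$upload_id" splited
# To check what S3 has received so far:
#   aws s3api list-parts --bucket s3-multistore-bucket \
#       --key 'TWICE「BDZ」Music Video.mp4' --upload-id "$upload_id"
```

Uploaded parts stay in S3 until the upload is completed or aborted, so an upload that will not be finished can be discarded with "aws s3api abort-multipart-upload" and the same bucket, key and upload-id.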


4. Create "multipart" file


I had a question here: how does AWS know which parts to assemble into a single object? Each part already carries a part-number, but S3 does not finish the object on its own. I have to tell it exactly which parts make up the final object by listing each part-number together with the ETag returned by upload-part. Because of this, I need to create a "multipart" file that defines the single object. The CLI manual [1] gives the format for this file.


# vi multipart.file

{

 "Parts":[

   {

      "ETag": "\"45de77ba3c71c3105a358971b7464b5a\"",

      "PartNumber" : 1

   },

   {

      "ETag": "\"7fa82dacd6f59e523e60058660e1ab6f\"",

      "PartNumber" : 2

   }

 ]

}
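Typing this file by hand invites quoting mistakes, because each ETag value contains its own double quotes. As a sketch, the file can be generated instead, assuming the part numbers and ETags are available as lines of the form: number, space, quoted ETag. The function name make_parts_file is hypothetical.

```shell
# make_parts_file OUTFILE
# Reads "part-number ETag" lines from standard input and writes the JSON
# document that complete-multipart-upload expects to OUTFILE.
make_parts_file() {
    {
        printf '{\n  "Parts": [\n'
        first=1
        while read -r number etag; do
            # Separate entries with a comma, but not before the first one.
            [ "$first" -eq 1 ] || printf ',\n'
            first=0
            # Strip the double quotes around the ETag value, then write
            # them back escaped so that the JSON string contains them.
            etag=${etag#\"}
            etag=${etag%\"}
            printf '    {"ETag": "\\"%s\\"", "PartNumber": %s}' "$etag" "$number"
        done
        printf '\n  ]\n}\n'
    } > "$1"
}

# Usage with the two ETags from step 3:
#   printf '%s\n' \
#       '1 "45de77ba3c71c3105a358971b7464b5a"' \
#       '2 "7fa82dacd6f59e523e60058660e1ab6f"' | make_parts_file multipart.file
```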


Now, I am ready to finish the multipart upload.


5. Complete multipart upload


I will run the "complete-multipart-upload" command to finish the whole process.


# aws s3api complete-multipart-upload --multipart-upload file://multipart.file --bucket s3-multistore-bucket --key 'TWICE「BDZ」Music Video.mp4' --upload-id j.kLRorRoj1WEDV9iH1jrIeee5KETBL3raUH5odcycSxsl0RZY4p9Q.WK04lL9c7tsUxmDXwGEHwVlgm_MR.La4IHkM1M5xNQrCXossn5L_nXKJ0v.9_B3mNbL8GSoE8WBToEMPfELYF3VPh3g5PHg--

{

    "ETag": "\"a1a27863cd20501d3e564e10b461d99f-2\"",

    "Bucket": "s3-multistore-bucket",

    "Location": "https://s3-multistore-bucket.s3.ap-northeast-2.amazonaws.com/TWICE%E3%80%8CBDZ%E3%80%8DMusic+Video.mp4",

    "Key": "TWICE「BDZ」Music Video.mp4"

}
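A quick sanity check after completion is to compare the size S3 reports with the size of the original file. This is only a sketch: the function name verify_size is my own, and a matching size does not prove that the contents are identical.

```shell
# verify_size BUCKET KEY LOCAL_FILE
# Succeeds (exit status 0) when the object's ContentLength reported by
# head-object equals the size of the local file in bytes.
verify_size() {
    remote=$(aws s3api head-object \
        --bucket "$1" --key "$2" \
        --query ContentLength --output text) || return 1
    # tr removes the padding that some wc implementations add.
    local_size=$(wc -c < "$3" | tr -d ' ')
    [ "$remote" = "$local_size" ]
}

# Usage (needs valid AWS credentials):
#   verify_size s3-multistore-bucket 'TWICE「BDZ」Music Video.mp4' \
#       'sample/TWICE「BDZ」Music Video.mp4' && echo "sizes match"
```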


Now, I can see this object in the S3 GUI. Look at the "ETag": it ends with "-2". This means the object was uploaded with multipart upload and that it was made of 2 parts. This is the reason why the "ETag" of some objects does not match the MD5 checksum of the original file.
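The multipart ETag can often be reproduced locally. The sketch below rests on an assumption that is commonly observed but, as far as I know, not guaranteed by AWS: the ETag is the MD5 of the concatenated binary MD5 digests of the parts, followed by "-" and the number of parts. It also assumes that md5sum from GNU coreutils is available. The function names hex_to_bin and multipart_etag are my own.

```shell
# hex_to_bin HEX
# Writes the raw bytes encoded by a hex string to standard output.
# Expects an even number of hex digits, such as an MD5 digest.
hex_to_bin() {
    hex=$1
    while [ -n "$hex" ]; do
        rest=${hex#??}          # everything after the first two characters
        byte=${hex%"$rest"}     # the first two characters
        hex=$rest
        # Turn the hex byte into an octal escape that printf understands.
        printf "\\$(printf '%03o' "0x$byte")"
    done
}

# multipart_etag PART...
# ASSUMPTION: ETag = MD5(concatenated binary MD5 of each part) + "-" + count.
multipart_etag() {
    count=$#
    for part in "$@"; do
        hex_to_bin "$(md5sum "$part" | cut -d' ' -f1)"
    done | md5sum | cut -d' ' -f1 | sed "s/\$/-$count/"
}

# Usage with the pieces from step 1:
#   multipart_etag splitedaa splitedab
```

If the assumption holds for this bucket, the output should equal the ETag returned in step 5, without the surrounding quotes. The part boundaries must be exactly the ones used for the upload, otherwise the result will differ.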


Reference


[1] https://docs.aws.amazon.com/cli/latest/reference/s3api/index.html

[2] https://www.computerhope.com/unix/usplit.htm
