Google Cloud Storage: Managing Files and Objects

This post is part of a series called Google Cloud Storage.
Google Cloud Storage: Managing Buckets

In the first part of this two-part tutorial series, we had an overview of how buckets are used on Google Cloud Storage to organize files. We saw how to manage buckets on Google Cloud Storage from Google Cloud Console. This was followed by a Python script in which these operations were performed programmatically.

In this part, I will demonstrate how to manage objects, i.e. files and folders inside GCS buckets. The structure of this tutorial will be similar to that of the previous one. First I will demonstrate how to perform basic operations related to file management using Google Cloud Console. This will be followed by a Python script to do the same operations programmatically.

Just as bucket names in GCS are subject to guidelines and constraints, so are object names. An object name must consist of valid Unicode characters and must not contain Carriage Return or Line Feed characters. It is also recommended to avoid characters such as "#", "[", "]", "*", and "?", as well as illegal XML control characters, because they can be misinterpreted and lead to ambiguity.
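
The guidelines above can be sketched as a small check. This is only an illustration: check_object_name and DISCOURAGED_CHARS are hypothetical names, not part of any Google library, and the control-character test covers only the C0 range rather than every illegal XML character.

```python
# Characters the naming guidelines recommend avoiding.
DISCOURAGED_CHARS = set('#[]*?')


def check_object_name(name):
    """Return (is_valid, warnings) for a candidate object name.

    Hypothetical helper for illustration; not part of any Google library.
    """
    # Hard rule: no Carriage Return or Line Feed characters.
    # (An empty name is also rejected here.)
    if not name or '\r' in name or '\n' in name:
        return False, ['name is empty or contains CR/LF']

    warnings = []
    found = sorted(DISCOURAGED_CHARS.intersection(name))
    if found:
        warnings.append('discouraged characters: ' + ' '.join(found))
    # C0 control characters other than tab are illegal in XML 1.0.
    if any(ord(ch) < 0x20 and ch != '\t' for ch in name):
        warnings.append('contains control characters')
    return True, warnings
```

A name such as report[1]?.pdf passes the hard rule but comes back with a warning, which mirrors the difference between a constraint and a recommendation.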

Also, object names in GCS live in a flat namespace. This means that physically there are no directories or subdirectories on GCS. For example, if you create a file named /tutsplus/tutorials/gcs.pdf, it will appear as though gcs.pdf resides in a directory named tutorials, which in turn is a subdirectory of tutsplus. As far as GCS is concerned, however, the bucket simply contains one object whose full name is /tutsplus/tutorials/gcs.pdf.
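
To make the flat namespace concrete, here is a minimal sketch of how a folder view can be derived from nothing but object names. It runs on a plain Python list and does not call GCS. The function list_with_delimiter and the sample names are hypothetical, loosely modeled on the prefix and delimiter parameters that the objects.list API call accepts.

```python
def list_with_delimiter(names, prefix='', delimiter='/'):
    """Split flat object names into direct 'files' and apparent 'folders'.

    Hypothetical stand-in for illustration only; it works on a plain list
    of names and does not talk to Google Cloud Storage.
    """
    items, prefixes = [], set()
    for name in names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter looks like a subfolder.
            prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            items.append(name)
    return items, sorted(prefixes)


# A bucket is just a flat collection of names.
bucket = [
    'tutsplus/tutorials/gcs.pdf',
    'tutsplus/tutorials/s3.pdf',
    'tutsplus/readme.txt',
    'notes.txt',
]

top_items, top_folders = list_with_delimiter(bucket)
sub_items, sub_folders = list_with_delimiter(bucket, prefix='tutsplus/')
```

At the top level this yields one file (notes.txt) and one apparent folder (tutsplus/), even though no folder object was ever stored.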

Let's look at how to manage objects using Google Cloud Console and then move on to the Python script to do the same thing programmatically.

Using Google Cloud Console

I will continue from where we left off in the last tutorial. Let's start by creating a folder.

Figure: Create a folder or upload files directly to GCS

To create a new folder, click on the Create Folder button highlighted above. Create a folder by filling in the desired name as shown below. The name should follow the object naming conventions.

Figure: Creating a folder in GCS

Now let's upload a file to the newly created folder.

Figure: Uploading a file in GCS

Once the upload finishes, the GCS browser will list the newly created objects. Objects can be deleted by selecting them from the list and clicking on the delete button.

Figure: Delete an object from GCS

Clicking on the refresh button will populate the UI with any changes to the list of objects without refreshing the whole page.

Managing Objects Programmatically

In the first part, we saw how to create a Compute Engine instance. I will use the same instance here and build upon the Python script from the last part.

Writing the Python Script

No additional installation steps are needed for this tutorial. Refer to the first part for details about installation and the development environment.

gcs_objects.py

import sys
from pprint import pprint

from googleapiclient import discovery
from googleapiclient import http
from oauth2client.client import GoogleCredentials


def create_service():
    # Build a client for the Cloud Storage JSON API (v1) using the
    # application default credentials of the environment.
    credentials = GoogleCredentials.get_application_default()
    return discovery.build('storage', 'v1', credentials=credentials)


def list_objects(bucket):
    service = create_service()
    # Create a request to objects.list to retrieve a list of objects,
    # restricting the response to the fields we want to print.
    fields_to_return = \
        'nextPageToken,items(name,size,contentType,metadata(my-key))'
    req = service.objects().list(bucket=bucket, fields=fields_to_return)

    all_objects = []
    # If you have too many items to list in one request, list_next() will
    # automatically handle paging with the pageToken. It returns None when
    # there are no more pages, which ends the loop.
    while req:
        resp = req.execute()
        all_objects.extend(resp.get('items', []))
        req = service.objects().list_next(req, resp)
    pprint(all_objects)


def create_object(bucket, filename):
    service = create_service()
    # This is the request body as specified:
    # https://g.co/cloud/storage/docs/json_api/v1/objects/insert#request
    body = {
        'name': filename,
    }
    with open(filename, 'rb') as f:
        req = service.objects().insert(
            bucket=bucket, body=body,
            # You can also just set media_body=filename, but for the sake of
            # demonstration, pass in the more generic file handle, which could
            # very well be a StringIO or similar.
            media_body=http.MediaIoBaseUpload(f, 'application/octet-stream'))
        resp = req.execute()
    pprint(resp)


def delete_object(bucket, filename):
    service = create_service()
    # A successful delete returns an empty response.
    res = service.objects().delete(bucket=bucket, object=filename).execute()
    pprint(res)


def print_help():
    print("""Usage: python gcs_objects.py <command>
Command can be:
    help: Prints this help
    list: Lists all the objects in the specified bucket
    create: Upload the provided file in specified bucket
    delete: Delete the provided filename from bucket
""")


if __name__ == "__main__":
    # Show the help text for a missing, unknown, or explicit "help" command.
    if len(sys.argv) < 2 or sys.argv[1] == "help" or \
            sys.argv[1] not in ['list', 'create', 'delete']:
        print_help()
        sys.exit()
    if sys.argv[1] == 'list':
        # Expects: list <bucket>
        if len(sys.argv) == 3:
            list_objects(sys.argv[2])
            sys.exit()
        else:
            print_help()
            sys.exit()
    if sys.argv[1] == 'create':
        # Expects: create <bucket> <filename>
        if len(sys.argv) == 4:
            create_object(sys.argv[2], sys.argv[3])
            sys.exit()
        else:
            print_help()
            sys.exit()
    if sys.argv[1] == 'delete':
        # Expects: delete <bucket> <filename>
        if len(sys.argv) == 4:
            delete_object(sys.argv[2], sys.argv[3])
            sys.exit()
        else:
            print_help()
            sys.exit()
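
The script labels every upload as application/octet-stream. As a possible refinement, the content type could be guessed from the file extension with the standard library's mimetypes module. The helper below is a sketch; guess_content_type is a hypothetical name that is not part of the script above.

```python
import mimetypes


def guess_content_type(filename, default='application/octet-stream'):
    """Guess a MIME type from the file extension, falling back to a default.

    Hypothetical helper for illustration; not part of the original script.
    """
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type or default
```

The returned value could then be passed as the second argument to http.MediaIoBaseUpload in place of the hard-coded string.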

The above Python script demonstrates the major operations that can be performed on objects. These include:

  • creation of a new object in a bucket
  • listing of all objects in a bucket
  • deletion of a specific object
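
The script checks sys.argv by hand for each of these commands. As an alternative, the same command-line interface could be described with the standard library's argparse module. This is only a sketch of that idea, not how the script above is written; build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser for the list, create, and delete commands."""
    parser = argparse.ArgumentParser(prog='gcs_objects.py')
    commands = parser.add_subparsers(dest='command')

    list_cmd = commands.add_parser('list', help='List all objects in a bucket')
    list_cmd.add_argument('bucket')

    create_cmd = commands.add_parser('create', help='Upload a file to a bucket')
    create_cmd.add_argument('bucket')
    create_cmd.add_argument('filename')

    delete_cmd = commands.add_parser('delete', help='Delete an object from a bucket')
    delete_cmd.add_argument('bucket')
    delete_cmd.add_argument('filename')
    return parser


# Parse a sample command line instead of the real sys.argv.
args = build_parser().parse_args(['create', 'tutsplus-demo-test', 'gcs_buckets.py'])
```

With this approach, argparse generates the usage text and rejects a wrong number of arguments, so the repeated length checks are no longer needed.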

Let's see how each of the above operations looks when the script is run.

$ python gcs_objects.py 
Usage: python gcs_objects.py <command>
Command can be:
    help: Prints this help
    list: Lists all the objects in the specified bucket
    create: Upload the provided file in specified bucket
    delete: Delete the provided filename from bucket

$ python gcs_objects.py list tutsplus-demo-test
[{u'contentType': u'application/x-www-form-urlencoded;charset=UTF-8',
  u'name': u'tutsplus/',
  u'size': u'0'},
 {u'contentType': u'image/png',
  u'name': u'tutsplus/Screen Shot 2016-10-17 at 1.03.16 PM.png',
  u'size': u'36680'}]

$ python gcs_objects.py create tutsplus-demo-test gcs_buckets.py 
{u'bucket': u'tutsplus-demo-test',
 u'contentType': u'application/octet-stream',
 u'crc32c': u'XIEyEw==',
 u'etag': u'CJCckonZ4c8CEAE=',
 u'generation': u'1476702385770000',
 u'id': u'tutsplus-demo-test/gcs_buckets.py/1476702385770000',
 u'kind': u'storage#object',
 u'md5Hash': u'+bd6Ula+mG4bRXReSnvFew==',
 u'mediaLink': u'https://www.googleapis.com/download/storage/v1/b/tutsplus-demo-test/o/gcs_buckets.py?generation=1476702385770000&alt=media',
 u'metageneration': u'1',
 u'name': u'gcs_buckets.py',
 u'selfLink': u'https://www.googleapis.com/storage/v1/b/tutsplus-demo-test/o/gcs_buckets.py',
 u'size': u'2226',
 u'storageClass': u'STANDARD',
 u'timeCreated': u'2016-10-17T11:06:25.753Z',
 u'updated': u'2016-10-17T11:06:25.753Z'}

$ python gcs_objects.py list tutsplus-demo-test
[{u'contentType': u'application/octet-stream',
  u'name': u'gcs_buckets.py',
  u'size': u'2226'},
 {u'contentType': u'application/x-www-form-urlencoded;charset=UTF-8',
  u'name': u'tutsplus/',
  u'size': u'0'},
 {u'contentType': u'image/png',
  u'name': u'tutsplus/Screen Shot 2016-10-17 at 1.03.16 PM.png',
  u'size': u'36680'}]

$ python gcs_objects.py delete tutsplus-demo-test gcs_buckets.py 
''

$ python gcs_objects.py list tutsplus-demo-test
[{u'contentType': u'application/x-www-form-urlencoded;charset=UTF-8',
  u'name': u'tutsplus/',
  u'size': u'0'},
 {u'contentType': u'image/png',
  u'name': u'tutsplus/Screen Shot 2016-10-17 at 1.03.16 PM.png',
  u'size': u'36680'}]
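
Two details of the responses above are easy to miss: size comes back as a string rather than a number, and md5Hash is the base64-encoded MD5 digest of the object's contents. The sketch below shows one way to work with both; total_size and md5_base64 are hypothetical helper names, and the sample items are copied from the listing output.

```python
import base64
import hashlib


def total_size(items):
    """Sum the 'size' fields, which the API returns as decimal strings."""
    return sum(int(item['size']) for item in items)


def md5_base64(data):
    """Return the base64-encoded MD5 digest of a bytes object."""
    return base64.b64encode(hashlib.md5(data).digest()).decode('ascii')


# Items copied from the listing output above.
listing = [
    {'name': 'tutsplus/', 'size': '0'},
    {'name': 'tutsplus/Screen Shot 2016-10-17 at 1.03.16 PM.png', 'size': '36680'},
]
print(total_size(listing))  # prints 36680
```

Comparing md5_base64 of a local file's bytes with the md5Hash field of the uploaded object is one way to confirm that the upload arrived intact.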

Conclusion

In this tutorial series, we started with a bird's-eye view of how Google Cloud Storage works and then took a closer look at buckets and objects. We saw how to perform the major bucket and object operations from Google Cloud Console.

We then performed the same operations using Python scripts. There is more that can be done with Google Cloud Storage, but that is left for you to explore.
