Connecting Google Drive


Google Drive is a cloud-based storage service for organizations of all sizes, with a focus on storing and collaborating on G Suite documents (Google Docs, Sheets, Slides, etc.). The Google Drive connector provided with Workplace Search automatically captures, syncs, and indexes the following items:

G Suite Documents

Including ID, File Metadata, File Content, Collaborators, and Timestamps

Stored Files

Including ID, File Metadata, File Content, Updated by, and Timestamps

Configuring the Google Drive Connector

Before you can connect the Google Drive service to Workplace Search, you must configure the Google Drive connector. This requires creating an OAuth App from the Google Drive platform.


Step 1. To get started, log in to the Google Developer Console.

Make sure that you create this project with a trusted and stable Google account. The personal documents of the Google account used to connect are also indexed, so we recommend using a team-owned account.


Step 2. We must first create a project and add access to the Google Drive API to it. Click Create when prompted:

Figure 42. Connecting Google Drive

If you already have active projects, you can create a new project from the Project Dropdown menu:

Figure 43. Connecting Google Drive

Fill in the project information in a way that best suits your organization.


Step 3. Using the search bar located at the top of the console, navigate to the Google Drive API:

Figure 44. Connecting Google Drive

Step 4. Enable access to the Google Drive API for this project:

Figure 45. Connecting Google Drive

Step 5. You will be prompted to Create Credentials:

Figure 46. Connecting Google Drive

Step 6. Select Google Drive API as the API you are using. Indicate that you will call it from a Web browser to access User data:

Figure 47. Connecting Google Drive

Step 7. Click What credentials do I need?, and select the Set up consent screen option:

Figure 48. Connecting Google Drive

Step 8. Select Internal as User Type.


Step 9. Fill out the information in the OAuth Consent Screen form.

Figure 49. Connecting Google Drive

Step 10. In the Scopes for Google APIs section, click Add scope and select the following scopes for the Google Drive API:

  • ../auth/drive.file
  • ../auth/drive.appdata

Figure 50. Connecting Google Drive

Save the changes.
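
The console lists these scopes in abbreviated form. As a minimal sketch, assuming the `..` prefix stands for Google's standard scope host `https://www.googleapis.com`, the helper below expands them to full scope URLs. The name `expand_scope` is hypothetical and used only for illustration.

```python
# Host that the console abbreviates as ".." (assumption: Google's standard scope host).
SCOPE_HOST = "https://www.googleapis.com"

# Abbreviated scopes exactly as listed in this guide.
ABBREVIATED_SCOPES = ["../auth/drive.file", "../auth/drive.appdata"]


def expand_scope(abbreviated: str) -> str:
    """Expand a console-style '../auth/...' scope into its full URL."""
    if not abbreviated.startswith("../"):
        raise ValueError(f"not an abbreviated scope: {abbreviated!r}")
    # Drop the leading ".." and keep the "/auth/..." part.
    return SCOPE_HOST + abbreviated[2:]


if __name__ == "__main__":
    for scope in ABBREVIATED_SCOPES:
        print(expand_scope(scope))
```

Comparing the expanded values with what the consent screen shows can help confirm that both scopes were added.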


Step 11. We must now create credentials for the OAuth application. Locate the Credentials menu item in the left sidebar, click Create Credentials and select OAuth Client ID.

Figure 51. Connecting Google Drive

Step 12. Fill out the OAuth form:

  • Application type: Web application.
  • Name: Any name that suits you. "Workplace Search" is a clear choice.
  • Authorized redirect URIs: The redirect URIs for your Workplace Search installation.

    The required redirect URIs depend on which user interface you use to manage Enterprise Search. Enterprise Search in Kibana and standalone Enterprise Search use different redirect URIs. See user interfaces for details on each UI.

    When using standalone Enterprise Search, use the following two URLs, replacing <WS_BASE_URL> with the base URL at which Workplace Search is hosted (scheme + host, no path):

    <WS_BASE_URL>/ws/org/sources/google_drive/create
    <WS_BASE_URL>/ws/sources/google_drive/create

    Examples:

    # Deployment using a custom domain name
    https://www.example.com/ws/org/sources/google_drive/create
    https://www.example.com/ws/sources/google_drive/create
    
    # Deployment using a default Elastic Cloud domain name
    https://c3397e558e404195a982cb68e84fbb42.ent-search.us-east-1.aws.found.io:443/ws/org/sources/google_drive/create
    https://c3397e558e404195a982cb68e84fbb42.ent-search.us-east-1.aws.found.io:443/ws/sources/google_drive/create
    
    # Unsecured local development environment
    http://localhost:3002/ws/org/sources/google_drive/create
    http://localhost:3002/ws/sources/google_drive/create

    When using Enterprise Search in Kibana, use the following URL, replacing <KIBANA_BASE_URL> with the base URL of your Kibana instance. This should match the value of kibana.external_url in your enterprise-search.yml:

    <KIBANA_BASE_URL>/app/enterprise_search/workplace_search/sources/added

    Examples:

    # Deployment using a custom domain name for Kibana
    https://www.example.com/app/enterprise_search/workplace_search/sources/added
    
    # Deployment using a default Elastic Cloud domain name for Kibana
    https://c3397e558e404195a982cb68e84fbb42.kb.us-east-1.aws.found.io:443/app/enterprise_search/workplace_search/sources/added
    
    # Unsecured local Kibana environment
    http://localhost:5601/app/enterprise_search/workplace_search/sources/added

Figure 52. Connecting Google Drive
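
The redirect URI patterns from Step 12 can be sketched as a small helper, which may be convenient when generating the values for several environments. This is an illustration only, not part of Workplace Search: the function names `standalone_redirect_uris` and `kibana_redirect_uri` are hypothetical, and the only inputs are the base URLs described above.

```python
from typing import List


def standalone_redirect_uris(ws_base_url: str) -> List[str]:
    """Build the two redirect URIs for standalone Enterprise Search.

    ws_base_url is the scheme + host (no path) where Workplace Search is hosted.
    """
    base = ws_base_url.rstrip("/")  # tolerate a trailing slash
    return [
        f"{base}/ws/org/sources/google_drive/create",
        f"{base}/ws/sources/google_drive/create",
    ]


def kibana_redirect_uri(kibana_base_url: str) -> str:
    """Build the single redirect URI for Enterprise Search in Kibana."""
    base = kibana_base_url.rstrip("/")
    return f"{base}/app/enterprise_search/workplace_search/sources/added"


if __name__ == "__main__":
    for uri in standalone_redirect_uris("http://localhost:3002"):
        print(uri)
    print(kibana_redirect_uri("http://localhost:5601"))
```

The values must match exactly what is entered under Authorized redirect URIs, including the scheme and any port.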

Step 13. Upon submission, you will be presented with a Client ID and Client Secret. Keep them handy, as we’ll need them in a moment.


Step 14. From the Workplace Search administrative dashboard’s Sources area, locate Google Drive, click Configure and provide both the Client ID and Client Secret.

Voilà! The Google Drive connector is now configured and ready to synchronize content. To capture data, you must now connect a Google Drive instance using the appropriate authentication credentials.

Connecting Google Drive to Workplace Search

Once the Google Drive connector has been configured, you may connect a Google Drive instance to your organization.


Step 1. Head to your organization’s Workplace Search administrative dashboard, and locate the Sources tab.


Step 2. Click Add a new source.


Step 3. Select Google Drive in the Configured Sources list, and follow the Google Drive authentication flow as presented.


Step 4. After you complete the authentication flow successfully, you will be redirected to Workplace Search.

Google Drive content will now be captured and will gradually become searchable as it is synced. Once the source is configured and connected, Google Drive synchronization runs automatically every 2 hours.

Document-level permissions

You can synchronize document access permissions from Google Drive to Workplace Search. This will ensure the right people see the right documents.

See Document-level permissions for Google.

Limiting the content to be indexed

If you don’t need to index all the available content, you can specify indexing rules via the API. This helps shorten indexing times and limits the size of the index. See Customizing indexing. For Google Drive, the applicable rule types are path_template and file_extension.

Keep two things in mind when writing indexing rules for a Google Drive source:

  • Google Drive native files (Google Docs, Google Sheets, etc.) don’t specify an extension, so extension-based rules will never match them.
  • Document paths will start with the name of the shared drive they exist in. Documents shared from a user’s private drive will appear to have incomplete paths, as the document path can only be resolved as far as the directory structure was shared to the content source’s connecting user.
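
The two caveats above can be illustrated with a small local model. This is a sketch under stated assumptions, not the Workplace Search rule engine: the sample documents are made up, the slash-separated path format is assumed, and the helpers `matches_extension` and `first_path_segment` are hypothetical names. Only the field names (title, extension, path) come from this connector's synchronized fields.

```python
from typing import Dict, List

# Hypothetical documents; the values and the path format are assumptions.
DOCUMENTS: List[Dict[str, str]] = [
    # A native Google Doc: no extension, stored in a shared drive named "Engineering".
    {"title": "Roadmap", "extension": "", "path": "Engineering/Plans/Roadmap"},
    # An uploaded spreadsheet in a shared drive named "Finance".
    {"title": "budget.xlsx", "extension": "xlsx", "path": "Finance/2021/budget.xlsx"},
    # A file shared from a private drive: only the shared part of the path is known.
    {"title": "notes.txt", "extension": "txt", "path": "notes.txt"},
]


def matches_extension(document: Dict[str, str], extension: str) -> bool:
    """Return True if the document's extension equals the given extension."""
    wanted = extension.lower().lstrip(".")
    # An empty extension never matches, which is why native files are skipped.
    return bool(wanted) and document["extension"].lower() == wanted


def first_path_segment(document: Dict[str, str]) -> str:
    """Return the first segment of the document path."""
    return document["path"].strip("/").split("/", 1)[0]


if __name__ == "__main__":
    for doc in DOCUMENTS:
        print(doc["title"], matches_extension(doc, "xlsx"), first_path_segment(doc))
```

In this model, no extension value ever matches the native document, and the privately shared file has no shared drive name at the start of its path, so a rule written against a shared drive name would not apply to it.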

Synchronized fields

The following table lists the fields synchronized from the connected source to Workplace Search. The attributes in the table apply to the default search application, as follows:

  • Display name - The label used when displayed in the UI
  • Field name - The name of the underlying field attribute
  • Faceted filter - Whether the field is a faceted filter by default, or can be enabled (see also: Customizing filters)
  • Automatic query refinement preceding phrases - The default list of phrases that must precede a value of this field in a search query in order to automatically trigger query refinement. If "None", a value from this field may trigger refinement regardless of where it is found in the query string. If '' (an empty string), a value from this field must be the first token(s) in the query string. If "N.A.", automatic query refinement is not available for this field by default. All fields that have a faceted filter (default or configurable) can also be configured for automatic query refinement; see also Update a content source, Get a content source’s automatic query refinement details and Customizing filters.

Display name          Field name            Faceted filter  Automatic query refinement preceding phrases
--------------------  --------------------  --------------  ------------------------------------------------
Id                    id                    No              N.A.
URL                   url                   No              N.A.
Title                 title                 No              N.A.
Type                  type                  Default         None
Created at            created_at            No              N.A.
Created by            created_by            Configurable    [creator is, created by, edited by, modified by]
Created by email      created_by_email      Configurable    [creator is, created by, edited by, modified by]
Updated at            updated_at            No              N.A.
Last updated          last_updated          No              N.A.
Viewed by me at       viewed_by_me_at       No              N.A.
Updated by me at      updated_by_me_at      No              N.A.
Updated by            updated_by            Configurable    [edited by, updated by, modified by]
Updated by email      updated_by_email      Configurable    [edited by, updated by, modified by]
Updated by photo url  updated_by_photo_url  No              N.A.
Shared by             shared_by             Configurable    [from, shared by]
Shared by email       shared_by_email       Configurable    [from, shared by]
Shared by photo url   shared_by_photo_url   No              N.A.
Size                  size                  No              N.A.
Author                author                Configurable    N.A.
Media type            mime_type             Default         None
Extension             extension             Default         None
Starred               starred               Configurable    N.A.
Path                  path                  No              N.A.
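
To make the meaning of the "Automatic query refinement preceding phrases" column concrete, here is a minimal sketch of the rule as described in this section. It is a simplified model, not the Workplace Search implementation: the function name `triggers_refinement`, the whitespace tokenization, and the case-insensitive matching are all assumptions. In the sketch, Python `None` stands for the table value "None", an empty string stands for '', and the string "N.A." stands for fields without refinement.

```python
from typing import List, Optional, Union

NOT_AVAILABLE = "N.A."  # refinement is not available for the field

Preceding = Optional[Union[str, List[str]]]


def triggers_refinement(query: str, value: str, preceding: Preceding) -> bool:
    """Model whether a field value found in a query would trigger refinement."""
    q = query.lower().split()
    v = value.lower().split()
    if preceding == NOT_AVAILABLE or not v:
        return False
    # Every position in the query where the value's tokens occur in order.
    starts = [i for i in range(len(q) - len(v) + 1) if q[i:i + len(v)] == v]
    if preceding is None:
        # "None": the value may appear anywhere in the query.
        return bool(starts)
    if preceding == "":
        # '': the value must be the first token(s) of the query.
        return 0 in starts
    # A list: one of the phrases must come directly before the value.
    for phrase in preceding:
        p = phrase.lower().split()
        for i in starts:
            if i >= len(p) and q[i - len(p):i] == p:
                return True
    return False


if __name__ == "__main__":
    created_by = ["creator is", "created by", "edited by", "modified by"]
    print(triggers_refinement("roadmap created by kim lee", "Kim Lee", created_by))
    print(triggers_refinement("roadmap kim lee", "Kim Lee", created_by))
    print(triggers_refinement("budget pdf", "pdf", None))
```

Under this model, a Created by value only refines the query when one of its phrases directly precedes it, while a Type, Media type, or Extension value can refine the query wherever it appears.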