Content Sources Overview



Workplace Search can ingest data from many different content sources. A content source is usually a third-party service like GitHub, Google Drive, or Dropbox. You can also build your own connectors using Custom API sources, which let you create your own content repositories on the platform and send any data to Workplace Search through uniquely identifiable endpoints.
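As a rough illustration of sending data to a Custom API source, the sketch below builds (but does not send) an HTTP request carrying one document. The endpoint path, the port, the source ID, the access token, and the document fields are all assumptions made for illustration; check the Custom API source reference for your version before relying on any of them.

```python
import json
import urllib.request


def build_bulk_create_request(base_url, content_source_id, access_token, documents):
    """Build, without sending, a request that indexes documents into a custom source.

    The URL shape below is an assumption, not a confirmed contract.
    """
    url = (
        f"{base_url.rstrip('/')}/api/ws/v1/sources/"
        f"{content_source_id}/documents/bulk_create"
    )
    body = json.dumps(documents).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values: replace them with the ones shown for your own source.
docs = [
    {
        "id": "doc-1",
        "title": "Onboarding guide",
        "body": "Welcome to the team.",
        "url": "https://example.com/docs/1",
    }
]
request = build_bulk_create_request(
    "http://localhost:3002", "my-source-id", "my-access-token", docs
)
print(request.get_method(), request.full_url)
# To actually send it, pass the request to urllib.request.urlopen(request).
```

Keeping the request construction separate from the network call makes it easy to inspect the URL and payload before anything is sent to a live deployment.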

Organization content sources vs private content sources

An administrator can set up sources that the whole organization can use. These are called organization sources. If the private sources feature is enabled, individual users can also set up their own sources, which remain accessible only to them. Learn more about Enabling private content sources.

Standard content sources vs remote content sources

Standard content sources store all documents visible to a user on disk, separately for each individual connection. For example, if three users each connect Google Drive as a private content source, each connection is synchronized into its own repository (index), and documents with shared access are duplicated across those repositories.

Remote content sources rely on the data source’s search endpoint to retrieve information, so only limited information is stored on disk and their impact on deployment size is small compared to standard content sources. For example, when you connect Slack as a private content source, messages and files remain searchable but are not synchronized or stored; only small amounts of metadata, such as channel names, are retrieved and kept on disk. This lets you enable personalized search at scale while keeping the size of your deployment in check.
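The duplication effect of standard sources can be made concrete with a little arithmetic. The sketch below is a toy model, not a description of how Workplace Search stores data internally: the user names and document names are invented, and it simply counts one stored copy per document per private connection.

```python
def standard_indexed_docs(visible_docs_per_user):
    """Count stored copies when every connection gets its own index."""
    return sum(len(docs) for docs in visible_docs_per_user.values())


def unique_docs(visible_docs_per_user):
    """Count distinct documents across all connections."""
    return len(set().union(*visible_docs_per_user.values()))


# Hypothetical example: three users connect the same service privately.
visible = {
    "alice": {"roadmap", "budget", "alice-notes"},
    "bob": {"roadmap", "budget", "bob-notes"},
    "carol": {"roadmap", "carol-notes"},
}

stored = standard_indexed_docs(visible)
distinct = unique_docs(visible)
print(stored, distinct, stored - distinct)  # 8 5 3
```

In this toy case, eight copies are stored for five distinct documents, so three copies are pure duplication. A remote source would store none of these documents, only small amounts of metadata.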

Read more in Enabling private sources.

Configuring First-Party Content Sources

To connect and synchronize data from content sources like Dropbox, an administrator must first configure the respective connector with valid OAuth information, where required. From the administrative dashboard, navigate to the Settings area, then to Content Source Connectors:

Figure 5. Configuring content source connectors

Each content source has its own guide to help you get started.

Connecting a Content Source

Once you have configured the connector for a content source by providing its OAuth information, the source is ready to be added (also known as connecting) to your organization. Once connected, its data, documents, and other relevant information are synchronized to your organization’s search experience.

From the administrative dashboard, navigate to Sources:

Figure 6. Content sources

Select a content source, click Add, and grant access to Workplace Search by following the instructions as provided by the third-party service:

A list of all the available content sources. Custom API has a blue sphere.
Figure 7. Adding a source

That’s it! You may now assign groups. Read more in Group management.

Synchronizing document-level permissions

Workplace Search can synchronize access permissions for a selection of content sources. This allows you to inherit document access permissions from the connected source, such as Google Drive or OneDrive, and apply them to a Workplace Search user. Read more about Permissions & access control.
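To illustrate what applying document access permissions to a user can mean at search time, here is a simplified model. It is a sketch under stated assumptions, not the product's actual evaluation logic: the field names `_allow_permissions` and `_deny_permissions`, the rule that a deny entry overrides an allow entry, and the permission strings themselves are all assumptions for the example.

```python
def can_view(document, user_permissions):
    """Return True if the user may see the document under this toy model.

    Assumed rule: any matching deny entry hides the document; otherwise at
    least one matching allow entry is required.
    """
    allowed = set(document.get("_allow_permissions", []))
    denied = set(document.get("_deny_permissions", []))
    user = set(user_permissions)
    if user & denied:
        return False
    return bool(user & allowed)


# Hypothetical documents and permission labels.
documents = [
    {"id": "plan", "_allow_permissions": ["team-a"], "_deny_permissions": []},
    {"id": "salaries", "_allow_permissions": ["team-a"], "_deny_permissions": ["contractor"]},
    {"id": "wiki", "_allow_permissions": ["everyone"]},
]

user_permissions = ["team-a", "contractor"]
visible_ids = [doc["id"] for doc in documents if can_view(doc, user_permissions)]
print(visible_ids)  # ['plan']
```

The user here matches the allow list of two documents, but the deny entry on the second one hides it, and the third is not visible because none of the user's permissions match its allow list.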

Customizing filters

You can enable faceted fields and hide default facets on content sources. These changes affect the native search experience provided with Workplace Search. You can further customize filtering behavior with automatic query refinement, which affects both the native search experience and the Search API. Read more about Customizing filters.

Configuration settings

You can customize some content source behavior using Enterprise Search configuration settings. See Configuration in the Enterprise Search documentation. All settings that begin with workplace_search.content_source affect the behavior of Workplace Search content sources.
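When auditing a deployment, it can help to list only the settings that fall under this prefix. The sketch below filters a dictionary of already-loaded settings by the prefix named above. The individual setting names and values are hypothetical placeholders, not real configuration keys; consult the Configuration documentation for the actual names.

```python
PREFIX = "workplace_search.content_source"


def content_source_settings(settings):
    """Return only the settings whose key begins with the content source prefix."""
    return {key: value for key, value in settings.items() if key.startswith(PREFIX)}


# Hypothetical keys and values, used purely to demonstrate the filtering.
settings = {
    "workplace_search.content_source.example_limit": 100,
    "workplace_search.content_source.example.enabled": True,
    "some_other.setting": "unrelated",
}

for key, value in sorted(content_source_settings(settings).items()):
    print(f"{key} = {value}")
```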