eDiscovery Overview in SharePoint 2013

March 12, 2013



By Pete DiChiara


One of the lesser-hyped features in SharePoint 2013 is eDiscovery and the improvements made to it. For those not familiar, eDiscovery is the method followed for preserving electronic documents and communications for use in the legal system’s discovery process. Given that SharePoint (and, while not the topic of this post, Exchange Server) is generally the central repository of information for businesses that use it, this is an important feature to have configured and ready in case of a legal event. Before we get into how it helps, note that the eDiscovery features are only available in the Enterprise license version of SharePoint.

Beyond managing information day to day, SharePoint 2013 can make most stages of the electronic discovery process easier:

Identification: Finding appropriate documents by creating eDiscovery sets from specific predefined sources (SharePoint sites or Exchange servers).

Preservation: Enabling in-place holds on specific or all documents and information on the SharePoint site without moving them to a new repository or preventing users from modifying and otherwise using the files.

Collection: New querying abilities allow groups of files to be identified quickly and precisely.

Processing: Query results can be exported as .csv/.mht/.pst files, or as the documents themselves, for later import into review tools.

So how does this work?

A broad overview of the process: sources are defined at the site, site collection, or service application level; crawls are set up to run on those sources; and permissions are set so that eDiscovery users can access what the crawls return.

The first thing to determine before setting up eDiscovery is how many eDiscovery Centers you’ll need. Centers are the hubs at which content is aggregated and queried. The number of Centers is directly tied to the number of Search service applications crawling the site. If one Search service application crawls the whole site, there will be only one Center. Likewise, if there are several that crawl different subsets or parts of the site, a Center will be needed for each one. This will play into permissions settings in a bit.
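To make the one-Center-per-Search-service-application rule concrete, here is a minimal planning sketch. It does not talk to SharePoint at all; the `SearchServiceApp` class, the application names, and the site paths are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchServiceApp:
    """Hypothetical stand-in for a Search service application and what it crawls."""
    name: str
    crawled_paths: tuple


def plan_centers(apps):
    """Plan one eDiscovery Center per Search service application."""
    return {
        f"eDiscovery Center ({app.name})": list(app.crawled_paths)
        for app in apps
    }


# Example farm (assumed): two Search service applications with separate scopes.
apps = [
    SearchServiceApp("Search-HR", ("/sites/hr",)),
    SearchServiceApp("Search-Legal", ("/sites/legal", "/sites/contracts")),
]

centers = plan_centers(apps)
for center_name, paths in centers.items():
    print(center_name, "->", paths)
```

With two Search service applications the plan contains two Centers, each limited to the content its own application crawls.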

At their core, the eDiscovery features are an extension of SharePoint’s search capabilities. The primary way to configure eDiscovery is to set up search crawls that cover all the documents and information on the site that need to be discoverable. This includes any file shares (name length under 259 characters!) that might be in use by the site.
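If you want to check a file share against that length limit before crawling it, a rough sketch like the following can help. The 259 figure is simply the one quoted above, and reading "name length" as the length of the full path is my assumption; adjust both to match your environment.

```python
import os

# Limit quoted in this post; treating it as a full-path limit is an assumption.
MAX_NAME_LENGTH = 259


def find_long_paths(root, limit=MAX_NAME_LENGTH):
    """Return files under root whose full path is not under the limit."""
    too_long = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            if len(full_path) >= limit:
                too_long.append(full_path)
    return sorted(too_long)
```

Calling `find_long_paths` with the root of a share returns the files that would need to be renamed or moved before they can be relied on as discoverable.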

Before setting up crawls, content sources need to be set up. Site and site collection sources are configured on the site itself via Site Settings, while service application sources are set up in Central Administration. Different permissions are needed depending on the level: a service application source requires Search service application administrator rights, a site collection source requires site collection administrator rights, and a site source requires site owner permissions. Result sources can be a complex beast; for specific steps on setting up the different sources, see this TechNet article.
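The level-to-permission mapping above is easy to lose track of, so here it is as a small lookup table. This is only a restatement of the paragraph in code form; the dictionary and function names are mine, not anything SharePoint exposes.

```python
# Where each source level is configured and who can configure it,
# as described in this post.
REQUIRED_ROLE = {
    "service application": "Search service application administrator",
    "site collection": "site collection administrator",
    "site": "site owner",
}

CONFIGURED_IN = {
    "service application": "Central Administration",
    "site collection": "Site Settings",
    "site": "Site Settings",
}


def describe_source_level(level):
    """Summarize where a source level is configured and the role it needs."""
    key = level.strip().lower()
    if key not in REQUIRED_ROLE:
        raise ValueError(f"unknown source level: {level!r}")
    return f"{key}: configure in {CONFIGURED_IN[key]}, requires {REQUIRED_ROLE[key]}"


for level in REQUIRED_ROLE:
    print(describe_source_level(level))
```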

Once sources have been specified, search crawls need to be configured to index their content. Even though the different sources require different levels of permission to access and configure, all crawls are set up in Central Administration. Crawls on a content source are configured in a new section of the left navigation, under Search Service Application > Crawling, titled (oddly enough) Content Sources. Just like with any search crawl, once it’s set up, make sure that the crawl has run successfully. At this point eDiscovery is enabled on a SharePoint content source.

After configuring sources and crawls on the content, permissions must be set so that users of the eDiscovery tool can actually use the Centers. Microsoft’s recommended practice is to place all users into a single group for each eDiscovery Center. Members of this group need access not only to the eDiscovery Center site collection but also to all content that is crawled and displayed in the Center. This includes file shares and any other servers (Exchange/Lync/etc.) that might be crawled. Because a user must have access to all the content of a Center, the separation between crawls becomes important when no single user or group of users should hold such broad permissions.
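One way to sanity-check that requirement is to compare what a Center surfaces against what its eDiscovery group can actually read. The sketch below does that with plain sets; the locations and the `missing_access` helper are hypothetical, and in practice the two lists would come from your own audit of crawl scopes and group permissions.

```python
def missing_access(center_content, group_access):
    """Return crawled locations that the eDiscovery group cannot read."""
    return sorted(set(center_content) - set(group_access))


# Assumed example: everything one Center surfaces vs. what its group can read.
center_content = ["/sites/legal", "/sites/hr", "//fileserver/contracts"]
group_access = ["/sites/legal", "/sites/hr"]

gaps = missing_access(center_content, group_access)
if gaps:
    print("Grant the eDiscovery group access to:", gaps)
else:
    print("The group can read everything this Center surfaces.")
```

Any location that shows up in `gaps` either needs to be opened up to the group or moved under a different crawl and Center.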

Finally, after all configuration is done, create an eDiscovery Center. Centers are a type of site collection and are created the same way as any other site collection (Central Administration > Application Management > Create Site Collection), by selecting eDiscovery Center in the Enterprise section. That should get SharePoint set up to handle eDiscovery cases and set content holds.