Understanding the search architecture of SharePoint 2013 is essential for developing any SharePoint search-based application. Several components make up the SharePoint search architecture. In the diagram below, components are highlighted in blue and red. The blue blocks are pre-built components that are part of the SharePoint search architecture, and the red blocks are the extensibility options available to developers.
Crawling Process
The crawling architecture consists of:
1. Content Sources – Repository
2. Connectors and Parsers
Content Sources – Repository
Content sources are the repositories, such as file shares, user profiles, SharePoint sites, Exchange, Documentum and so on, that we want to index using SharePoint search.
The custom content source at the bottom, shown in red, is an extensibility point for developers: we can create Business Connectivity Services (BCS) connectors that can connect to any repository.
Business Connectivity Services (BCS) is an important companion in the overall search architecture because it allows us to connect to various content sources.
Connectors and Parsers
Any search engine needs two things to search a repository successfully. First, it must be able to connect to the repository. Second, it must be able to access the items in that repository and read through them for indexing.
For example, when crawling a document repository, we first have to gain access to the repository itself, such as a file system or document store. We then have to retrieve the documents we find there and work through them in order to index them, so that when people run keyword searches, they search the body of each document and all of its metadata.
Connector components – responsible for providing access to the repository.
Parser components – responsible for retrieving the individual items within the repository and parsing them so that search can index their content.
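To make the split between the two roles concrete, here is a minimal Python sketch. It is not SharePoint code: the class names (InMemoryConnector, PlainTextParser), the crawl function and the sample repository are all hypothetical and exist only to illustrate that the connector reaches the repository while the parser turns each item into indexable text and metadata.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass
class RawItem:
    """One item as the connector hands it over: an address plus raw bytes."""
    url: str
    content: bytes


class InMemoryConnector:
    """Hypothetical connector: knows how to reach a repository and list its items."""

    def __init__(self, repository: Dict[str, bytes]) -> None:
        self._repository = repository

    def enumerate_items(self) -> Iterator[RawItem]:
        for url, content in self._repository.items():
            yield RawItem(url=url, content=content)


class PlainTextParser:
    """Hypothetical parser: turns one raw item into indexable text and metadata."""

    def parse(self, item: RawItem) -> Dict[str, str]:
        body = item.content.decode("utf-8")
        # Treat the first line as the title; an empty body gets an empty title.
        title = body.splitlines()[0] if body else ""
        return {"path": item.url, "title": title, "body": body}


def crawl(connector: InMemoryConnector, parser: PlainTextParser) -> List[Dict[str, str]]:
    """Connect to the repository first, then parse every item found there."""
    return [parser.parse(item) for item in connector.enumerate_items()]


# Sample data standing in for a file share (made up for this sketch).
repository = {
    "file://share/readme.txt": b"Readme\nSearch architecture notes",
    "file://share/todo.txt": b"Todo\nIndex the file share",
}
documents = crawl(InMemoryConnector(repository), PlainTextParser())
print(documents[0]["title"])
```

In a real crawl the connector would also deal with authentication and incremental changes, and there would be a parser per file format; the sketch leaves all of that out.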
Content Processing
The content processing architecture is responsible for receiving information from the crawling architecture and then building up the index and the managed properties. In the diagram, the content pipeline represents the set of components that manage the metadata coming in from the external repositories.
The content pipeline has an extensibility point, the web service callout. The callout allows us to create a custom web service that the pipeline calls while processing items, so that we can enrich or modify item properties before they are indexed. The content pipeline feeds the indexing engine, which builds up the index. The search schema is another extensibility point: we can create our own managed properties, define aliases for them and configure their settings. All of this was discussed in my previous article on the Keyword Query Language (KQL).
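The flow through the content pipeline can be sketched in a few lines of Python. This is only a conceptual model under stated assumptions: in SharePoint the callout is a web service invoked by the pipeline, whereas here a plain function (enrich_with_department, a made-up name) stands in for it, and the "index" is a toy dictionary keyed by property:value. The Department property and the sample URLs are also invented for the example.

```python
from typing import Callable, Dict, List, Optional

Item = Dict[str, str]
Callout = Callable[[Item], Item]


def enrich_with_department(item: Item) -> Item:
    """Hypothetical callout: derive a 'Department' property from the path."""
    enriched = dict(item)
    if "/hr/" in item.get("Path", ""):
        enriched["Department"] = "HR"
    return enriched


def process_content(items: List[Item],
                    callout: Optional[Callout] = None) -> Dict[str, List[str]]:
    """Run each crawled item through the optional callout, then index it.

    The index is a toy inverted index: "property:value" -> list of paths.
    """
    index: Dict[str, List[str]] = {}
    for item in items:
        if callout is not None:
            item = callout(item)  # extensibility point: enrich before indexing
        for prop, value in item.items():
            if prop == "Path":
                continue  # the path identifies the item, it is not indexed as a property
            key = f"{prop.lower()}:{value.lower()}"
            index.setdefault(key, []).append(item["Path"])
    return index


# Sample crawled items (made up for this sketch).
crawled = [
    {"Path": "http://intranet/hr/leave-policy.docx", "Title": "Leave policy"},
    {"Path": "http://intranet/it/vpn-guide.docx", "Title": "VPN guide"},
]
index = process_content(crawled, callout=enrich_with_department)
print(index["department:hr"])
```

Without the callout, the Department property never reaches the index, which is the point of the extensibility hook: it lets us add or change properties that the crawled source did not supply.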
Query Architecture
The query architecture consists of a query engine and a query pipeline. The query engine runs queries against the index, and the query pipeline represents all the components that process input from end users to generate those queries.
The search center and the topic pages are out-of-the-box solutions, but they use the same API extensibility points (REST or CSOM) that we use to create custom solutions.
The no-code solution shown at the bottom is about extending the search center. With this version of search, Microsoft has done a great job of giving us ways to create powerful search solutions without writing any code or opening Visual Studio.
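As a rough idea of what using the REST extensibility point looks like, the Python helper below builds a URL for the search REST endpoint (/_api/search/query) with the querytext, selectproperties and rowlimit parameters. The function name build_search_url, the site URL and the selected properties are assumptions for illustration, and the doubling of single quotes assumes OData-style string escaping. The code only builds the URL; it does not send the request or handle authentication.

```python
from typing import Optional, Sequence
from urllib.parse import quote


def build_search_url(site_url: str,
                     query_text: str,
                     select_properties: Optional[Sequence[str]] = None,
                     row_limit: Optional[int] = None) -> str:
    """Build a search REST query URL (hypothetical helper, URL only)."""
    # Assumption: a single quote inside the quoted value is escaped by doubling it.
    escaped = query_text.replace("'", "''")
    encoded = quote(escaped, safe="")
    params = ["querytext='" + encoded + "'"]
    if select_properties:
        params.append("selectproperties='" + ",".join(select_properties) + "'")
    if row_limit is not None:
        params.append("rowlimit=" + str(row_limit))
    return site_url.rstrip("/") + "/_api/search/query?" + "&".join(params)


# Example site URL and properties are made up for this sketch.
url = build_search_url("http://intranet", "sharepoint search",
                       select_properties=["Title", "Path"], row_limit=10)
print(url)
```

The query text can be any KQL expression, so the same helper works for a plain keyword or for a property restriction on a managed property.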
Managing SharePoint Search
The Search Service Application is the primary place to manage SharePoint search. Go to Central Admin -> Manage Service Applications -> Search Service Application -> Manage (on the top ribbon).
On the Search Administration page, we can access details on crawling, queries and results, and so on.
Happy Coding
Ahamed