Planning for Search and Profile Crawl Schedules
Stuff that could make a difference:
- For incremental crawls (of regular SharePoint content)
  - Farm architecture
    - Is the indexer (and/or the target server) on its own host, or is it doing other tasks? If it’s doing other tasks, you must take into account the kind of load it’s under and when that load is high and low.
    - Are the database hosts scaled so that running crawls don’t impact user traffic?
    - What’s your “Indexer Performance” level set to?
    - How fast does your indexer crawl? (i.e. items per hour or items per day)
  - Content
    - How many items are currently (or will be at the time of go-live) in your index?
    - How frequently does it change?
    - What kind of content is it? Big PDFs and Word docs? SharePoint lists and news posts only?
    - What’s the anticipated growth rate of the content?
  - Are there any documented requirements for search freshness?
- For full crawls (of regular SharePoint content)
  - Pretty much the same stuff as above
  - Can you get away with doing incrementals only? (See this TechNet page in the “Reasons to do a full crawl” section for info about when you must use full crawls instead of incrementals.)
- For profile imports
  - Number of exposed profiles
  - Configuration of MySites and number of mapped properties
  - Can AD handle the load?
  - Is the SSP server performing other roles?
  - Profile freshness requirements
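Once you have answers to the crawl speed and change rate questions, a quick back-of-the-envelope check tells you whether an incremental crawl can finish inside your quiet hours. Here is a minimal Python sketch of that check. The function name and all the numbers in it are made up for illustration; plug in whatever your own crawl logs tell you.

```python
def incremental_fits_window(changed_items, items_per_hour, window_hours):
    """Estimate how long an incremental crawl takes and whether it fits
    inside the off-peak window. All inputs are your own measurements."""
    if items_per_hour <= 0:
        raise ValueError("items_per_hour must be positive")
    hours_needed = changed_items / items_per_hour
    return hours_needed, hours_needed <= window_hours


# Hypothetical numbers, for illustration only:
# 200,000 changed items, 50,000 items crawled per hour, 8-hour night window.
hours, fits = incremental_fits_window(
    changed_items=200_000, items_per_hour=50_000, window_hours=8
)
print(f"{hours:.1f} hours needed, fits in window: {fits}")
```

If the answer is "doesn't fit", that is your cue to look at the farm architecture questions again (dedicated indexer, database capacity, indexer performance level) or to relax the freshness requirement.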
Real life example:
My current client project has pretty fast servers, 14+ million items in the index, 40,000 users, and 4 WFEs. We would be doing incremental crawls only if we could get around a technical issue. Full crawls take about 4 days (a bit over 4 million items crawled per day). Until the problem is fixed, we have been doing full crawls once a week, but we really can’t keep that up. Incremental crawls happen nightly and are usually done before the work day begins (unless users used Explorer View to move or change millions of items the previous day).
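The full crawl arithmetic here is simple enough to script, which is handy when you want to see how content growth will stretch the crawl. This is a rough Python sketch, not anything SharePoint gives you: the index size and crawl rate are the round figures from this example, while the function names and the 2% monthly growth rate are purely assumptions for illustration.

```python
def full_crawl_days(items_in_index, items_per_day):
    """Days needed to crawl the whole index at a steady rate."""
    if items_per_day <= 0:
        raise ValueError("items_per_day must be positive")
    return items_in_index / items_per_day


def projected_items(items_now, monthly_growth_rate, months):
    """Index size after compounding growth (growth rate is an assumption)."""
    return items_now * (1 + monthly_growth_rate) ** months


# Round figures from the example: 14 million items, about 4 million per day.
days_now = full_crawl_days(14_000_000, 4_000_000)  # 3.5 days

# Assumed 2% monthly growth, looking one year ahead.
items_next_year = projected_items(14_000_000, 0.02, 12)
days_next_year = full_crawl_days(items_next_year, 4_000_000)

print(f"Full crawl today: {days_now:.1f} days")
print(f"Full crawl in a year: {days_next_year:.1f} days")
```

With a weekly full crawl, anything that pushes the duration toward 7 days means the crawls start running back to back, which is one more reason to get onto incrementals as soon as the technical issue allows.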
As for profile imports, since MySites are not exposed for all 40,000 users, profile imports happen very fast. Incrementals finish in under 20 minutes and fulls finish in a couple of hours on Sunday mornings. However, this summer we will expose SP 2010 MySites for all 40,000 users with some fairly moderate custom mappings. When that happens, we will be putting some serious focus on tuning to make sure the imports run efficiently.
Here and here are some good tuning tips for MOSS… most are probably relevant for SP 2010 also. Also, the “Inside the Index and Search Engines: Microsoft Office SharePoint Server 2007” book has been pretty helpful.
