Planning for Search and Profile Crawl Schedules

By PlateSpinner, June 2, 2010 3:50 pm

Stuff that could make a difference:

  • For Incremental crawls (of regular SharePoint content)
    • Farm architecture
      • Is the indexer (and/or the target server) on a dedicated host, or is it doing other tasks?  If it’s doing other tasks, you must take into account the kind of load it’s under and when that load is high and low.
      • Are the database hosts scaled so that running crawls don’t impact user traffic?
      • What’s your “Indexer Performance” level set to?
      • How fast does your indexer crawl?  (i.e. items per hour or items per day)
    • Content
      • How many items are currently (or at the time of go-live) in your index?
      • How frequently does it change?
      • What kind of content is it?  Big pdfs and Word docs?  SharePoint lists and news posts only?
      • What’s the anticipated growth rate of the content?
      • Are there any documented requirements for search freshness?
  • For Full crawls (of regular SharePoint content)
    • Pretty much the same stuff as above
    • Can you get away with doing incremental only?  (See this TechNet page in the “Reasons to do a full crawl” section for info about when you must use full crawls instead of incrementals.)
  • For profile import
    • Number of exposed profiles
    • Configuration of Mysites and number of mapped properties
    • Can AD handle the load?
    • Is the SSP server doing other roles?
    • Profile freshness requirements

Real life example:

My current client project has pretty fast servers, 14+ million items in index, 40,000 users, and 4 WFEs.  We would be doing incremental crawls only if we could get around a technical issue.  Full crawls take about 4 days (a bit over 4 million items crawled per day).  Until the problem is fixed, we have been doing full crawls once a week, but we really can’t keep it up.  Incremental crawls happen nightly and are most often done before the work day begins.  (This is unless users used Explorer View to move or change millions of items the previous day.)
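As a rough sanity check on numbers like these, full-crawl duration can be estimated from index size and crawl rate.  The sketch below is only back-of-the-envelope arithmetic in Python; the function name is made up for illustration, the inputs are the approximate figures quoted above, and it assumes a constant crawl rate, which real crawls rarely have.

```python
def estimated_crawl_days(items_in_index: int, items_per_day: int) -> float:
    """Rough full-crawl duration in days, assuming a constant crawl rate."""
    if items_per_day <= 0:
        raise ValueError("items_per_day must be positive")
    return items_in_index / items_per_day


# Approximate figures from the example above (assumption: steady rate).
days = estimated_crawl_days(14_000_000, 4_000_000)
print(f"Estimated full crawl: {days:.1f} days")
```

Plugging in your own item count and measured crawl rate gives a first guess at how large a crawl window to schedule; content growth and competing load will push the real number higher.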

As for profile imports, since MySites are not exposed for all 40,000 users, profile imports happen very fast.  Incrementals happen in under 20 minutes and fulls finish in a couple of hours on Sunday mornings.  However, this summer we will expose SP 2010 MySites for all 40,000 users with some fairly moderate custom mappings.  When that happens we will be putting some serious focus into tuning to make sure that it happens efficiently.

Here and here are some good tuning tips for MOSS… most are probably relevant for SP 2010 also.  Also, the “Inside the Index and Search Engines: Microsoft Office SharePoint Server 2007” book has been pretty helpful.

Take this job and…

By PlateSpinner, March 16, 2010 6:00 pm

I know a very talented and successful developer who definitely makes six figures in annual salary.  But every now and again, he’ll have a really tough day and he’ll turn to me and say, “I think I’ll quit my job and open up a hotdog stand.”

I’m having one of those days.

Blue Nirvana drink recipe

By PlateSpinner, February 6, 2010 9:18 pm

With my in-law siblings, we all decided that it was no good buying Christmas gifts for each other.  Mainly because there are enough children and old people to buy gifts for.  Also, if any of us really want something, we will usually go and get it ourselves.

So we developed an institution that we have called “The Booze Exchange”.  Each household gives the ingredients and recipe for a mixed drink rather than giving some materialistic gift. 

This year, the elder in-law siblings gave us “Blue Nirvana” for The Booze Exchange.  We mixed them up tonight.  They’re very tasty.  Here’s the recipe:

 

Blue Nirvana

Pour 1 oz citrus-flavored vodka, 1 oz Blue Curacao and a splash of sour mix in a flute. Then fill with champagne.

Makes 1 serving

IIS WAMREG DCOM permissions for Server 2008 R2

By PlateSpinner, February 5, 2010 2:23 pm

Here’s an interesting nugget.  If you use a “least privileges approach” in setting up MOSS with the full array of service accounts (and you should), there are several permission changes that you need to make to DCOM config in order to make some event log errors go away.  This much is not news.

BUT, in R2 of Server 2008, the most important one of these is locked down and won’t let you change it.  If you try to change the “IIS WAMREG admin Service” in R2 you’ll see a grayed out screen like this:

[Screenshot: the grayed-out properties dialog for the IIS WAMREG admin Service]

Even if you’re a full admin, you’re still locked out.  It turns out you have to go into regedit and give yourself permissions to the corresponding registry key just to be ABLE to modify it in DCOM.  I found full instructions to fix it up in this blog post: http://www.wictorwilen.se/Post/Fix-the-SharePoint-DCOM-10016-error-on-Windows-Server-2008-R2.aspx

I had to pass this on because we’ll be running into this a lot as we get involved with MOSS installs on Server 2008 R2 servers.

“The list is too large to save as a template.”

By PlateSpinner, November 20, 2009 1:51 pm

If you’re working with a SharePoint 2007 environment with multiple site collections but do not have access to cool tools to manage moving content around, then you may have had this problem: 

Someone needed to move a Document Library from one site collection to another one.  The only viable way to do this is to go to the library settings for that doc library and export it as a list template with content.  When I did that I received the in-SharePoint error saying:

“The list is too large to save as a template. The size of a template cannot exceed 10485760 bytes.” 

This poses a problem because the only other (non third-party tool) way to do this would involve exporting a copy of the original site and then importing it into the destination site collection so that I would be able to use Site Manager to move the list.

Well it turns out there is an undocumented property that can be modified to change this.  Run the following command to see what your farm is currently set to:

stsadm.exe -o getproperty -propertyname max-template-document-size

Chances are the reply will be “<Property Exist="No" />”, which means that you are using the default setting of 10 MB as your limit.  You can set the limit to something higher by using the “setproperty” operation.  For example:

stsadm -o setproperty -propertyname max-template-document-size -propertyvalue 500000000

would increase the limit to about 500 MB.

In my case this was really a one-time thing so, after I had created the copy of the document library, I set the size back to the original 10485760 byte value.
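Since the property value is given in bytes, it is easy to be off by a factor of ten when picking a number.  Here is a small Python sketch for checking the arithmetic; the helper name is made up for illustration, and the two values are the ones used in this post.

```python
def bytes_to_mib(size_in_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return size_in_bytes / (1024 * 1024)


default_limit = 10_485_760   # the default limit quoted in the error message
raised_limit = 500_000_000   # the value passed to setproperty above

print(f"{default_limit} bytes = {bytes_to_mib(default_limit):.1f} MiB")
print(f"{raised_limit} bytes = {bytes_to_mib(raised_limit):.1f} MiB")
```

The default works out to exactly 10 MiB, which matches the “10 MB” default described above.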

Troubleshoot MOSS Profile Sync Issues

By PlateSpinner, November 11, 2009 4:52 pm

There was a problem at a client site where user profiles were not getting synced on certain content databases while others were getting synced.

Out of the box, MOSS is set to run the profile sync timer job once every hour.  So I ran the following command to show me which content DBs haven’t been synced in the last day:

stsadm.exe -o sync -listolddatabases 1

On my vpc, it looks like this:

[Screenshot: output of the listolddatabases command]

This looks helpful except for the fact that nobody knows what the GUIDs of their content DBs are.  On the client farm there were some DBs with very old dates and I could tell something was wrong.  I needed to figure out the names of the DBs behind those GUIDs.  The easiest way to do that is to browse to Central Administration and mouse over your content DBs.  Then you can eyeball the DB GUIDs in the status bar of your browser as it shows you the URL of the hyperlink.  It looks like this:

[Screenshot: browser status bar showing the URL of a content database link]

The GUID of the content DB you are pointing to starts after the “DatabaseID=%7B” in your status bar. Also take note that the “%2D” in the status bar refers to the hyphen character.  Now you know the GUID of that SharePoint Content Database.
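If eyeballing the status bar gets tedious, the same decoding can be done in a few lines of Python.  This is just a sketch: the URL below is entirely hypothetical (the host, port, page name and GUID are made up), and the only thing taken from the description above is the “DatabaseID” query parameter with its percent-encoded braces and hyphens.

```python
from urllib.parse import urlparse, parse_qs


def database_guid_from_url(url: str) -> str:
    """Pull the content database GUID out of a Central Administration link."""
    query = parse_qs(urlparse(url).query)
    # parse_qs percent-decodes values, so %7B, %7D and %2D
    # come back as "{", "}" and "-".
    raw = query["DatabaseID"][0]
    return raw.strip("{}")


# Hypothetical link copied from the status bar (all values are made up).
url = (
    "http://centraladmin:12345/_admin/example.aspx"
    "?DatabaseID=%7B01234567%2D89AB%2DCDEF%2D0123%2D456789ABCDEF%7D"
)
print(database_guid_from_url(url))
```

The result can then be compared against the GUIDs listed by the listolddatabases command.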

I tried going back to the command-line to use stsadm to force it to sync.  To do that I just typed in:

stsadm.exe -o sync

I waited a while and even checked the timer job status in Central Admin until it said the Profile Sync job had completed successfully.  It turns out that this sync command only forces the “quick” version, which does not go through the same job that is scheduled for every hour.  I was disappointed to see that my content database did not get updated, so next I used stsadm to wipe out the profile sync info for the databases with old data.  The following command will delete the profile sync data for databases that haven’t been synced in the last day:

stsadm.exe -o sync -deleteolddatabases 1

I was feeling good about myself at this point but after double-checking my work I learned that it STILL had not synced.  There was no getting around it, I had to look at the ULS logs.  When I poked around I found a line that kept showing up that alarmed me.  Referring to “SharePoint Portal Server User Profiles” the description portion of the error said something like “Aborting sweepsynch for guid instance {SomeGUID} due to null or non-online content database”.

This sounded familiar to me because I knew we had left some content databases in “offline” mode to prevent them from getting new site collections created in them (they’re already too big).  So when I looked back in Central Administration to see which databases were offline, it turned out to be every one of the ones that weren’t syncing the profiles.

This isn’t documented anywhere but it looks like content databases that are marked as “offline” in central administration will not get synced with profile sync jobs.  Once I turned them all back to “ready”, I had a very looooong running profile sync job after that and it was all clear from there on out.

What release does my SharePoint version number mean?

By PlateSpinner, July 30, 2009 8:22 am

A coworker ranted for a while about people blogging information and not updating it or simply pointing to authoritative sources. His ire was kindled recently because he was looking for documentation on what MOSS version numbers correlate to what updates.

I thought I had seen this info on the Updates Resource Center page, but either I was wrong or it’s gone now.

It took me a while but I found it on TechNet. It’s on the “Deploy software updates for Office SharePoint Server 2007” TechNet page in the “Available Updates” section.

Not only is it there, but it appears to be updated.

Firm saves 1.8 mil/year by using MOSS Search

By PlateSpinner, June 30, 2009 10:42 am

Here’s a case study that describes an architecture firm that used MOSS to provide search for its 10 TB of data. That’s a freaky huge amount of data to crawl and I’m here to tell you that it’s no small thing to manage that kind of search repository.

I’m working with a farm right now that can crawl around 1,000,000 items in 24 hours. We’re working on increasing that right now by getting a faster PDF ifilter and tuning some things up but it’s only going to take us so far.

At some point soon we’re going to have to start considering some of the SSP and search wizardry mentioned in this awesome whitepaper from Microsoft IT, “SharePoint Performance Optimization”.

Now that SP2 for SharePoint 2007 is out…

By PlateSpinner, April 28, 2009 11:56 am

This is a great chance to clean up all of those annoying bugs and errors that you have in your SharePoint deployment.

You can get the packs here:

What’s really important is that SP2 rolls up all of the previous cumulative updates (CUs) and adds improvements to:

  • performance
  • reliability
  • search and crawling
  • alternative browser support

Don’t forget to do regression testing with any existing custom code or 3rd party solutions before you commit to deploying into production.  Plan a relatively large maintenance window because you must deploy the pack for WSS first and then the one for MOSS, one after the other.  For more direction on deploying SharePoint updates, see the TechNet article here.  Although it’s not updated quite yet, there will be more information about it posted soon on the "Updates Resource Center for SharePoint Products and Technologies", also found on TechNet.  And, as you might have guessed, Joel Oleson has a very useful and comprehensive post about the goodies that are baked into SP2.

Dr. Horrible will stop the world with his Freeze Ray

By PlateSpinner, July 17, 2008 7:55 am

"The thoroughbred of sin." Like any good writer would, Joss Whedon got a little bored and frustrated during the writers’ strike.  So, intentionally relying on the Internet as its medium and conduit, he has created a supervillain musical called “Dr. Horrible’s Sing-Along Blog”, with Neil Patrick Harris as “Dr. Horrible” and Nathan Fillion as “Captain Hammer”.

I’ve already watched Act I and am about to watch Act II.  Act III will be released on 7/19.  In case you weren’t already convinced, I will tell you.  NPH is hysterical in this.

After reading the plan, I’m pretty intrigued by this and can’t wait to see if it takes off.  You’d better watch them all now because they will no longer be free after Sunday, 7/20.  After that it will be available for a “nominal fee”.

"You see harmless death-nerds... I see future super-villains."

Also, the little comic (a prequel of sorts) is pretty funny too. 

Check it out.
