The International Journal 
of Newspaper Technology


December 2002


Atomz
650.244.1400
www.atomz.com

Convera
703.761.3700
www.convera.com

I still haven't found what I'm looking for...
Search engine technology works both ways

By Hays Goodman
Associate Editor


Search engine technology serves a newspaper Web site in two ways: externally and internally. Externally, newspapers rely on consumer search engines such as Yahoo!, Google and AltaVista for a portion of their Internet traffic.

Most people know to type in a popular newspaper’s address, such as dallasmorningnews.com, to reach its site, but users frequently return to retrieve individual articles. If a paper makes thousands of articles available to a consumer search engine, the potential reach of its Web site expands accordingly, since the odds of matching a given search phrase also grow.

As reported on searchenginewatch.com, two major Internet traffic ratings services, NetRatings and Jupiter Media Metrix, largely agree on which search engines are most popular and how large a slice of traffic each draws. In terms of Web audience reach, the top three are MSN (between 30 percent and 37 percent), Yahoo! (between 30 percent and 34 percent) and Google (between 28 percent and 30 percent).

According to market research firm StatMarket, about 52 percent of Internet users found Web sites via direct navigation or bookmarks. Random surfing accounted for 41 percent, and search engines contributed between 7 percent and 8 percent of traffic to a site. However, that statistic doesn’t reflect the importance of finding a site initially with a search engine. Frequently, once a user finds a site’s address, the user bookmarks or remembers it, so the engine isn’t used a second time for that particular site. In that case, search engines serve as discovery tools.

Two years ago, Yahoo! and AltaVista, at that time the two most popular search sites, went to a paid model that offered customers guaranteed placement of search results in their indexes.

Such paid programs can mean a lot of money for search engines. Yahoo!, for example, charges $299.99 for its “site express” service, which guarantees that a human editor will evaluate a site but does not guarantee placement in its directory. The fee is charged again each year for continued placement.

AltaVista has a similar, less costly plan: $39 per URL for a single site, with a sliding scale for multiple URLs, for a six-month inclusion in its index. Google takes a different approach, ranking listings by their relative popularity: If a large number of other sites link to a site, and its text closely matches the query, then it ranks high in Google’s results. Google does not accept paid placement, but it does sell advertising that is displayed according to the specific search terms consumers use.
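To make the link-popularity idea concrete, here is a minimal sketch in Python. It is not Google’s actual algorithm, which the article does not describe in detail. The function name `toy_rank`, the scoring formula and the sample data are all assumptions made for illustration. The sketch simply multiplies a count of query-word matches by a dampened count of inbound links.

```python
import math
from collections import Counter


def toy_rank(pages, links, query):
    """Rank pages by text match weighted by inbound links.

    pages: dict mapping URL -> page text
    links: list of (from_url, to_url) pairs
    query: search phrase
    The formula is a hypothetical illustration, not any engine's real scoring.
    """
    # Count how many other pages link to each URL (self-links are ignored).
    inbound = Counter(dst for src, dst in links if src != dst)
    terms = query.lower().split()
    scores = {}
    for url, text in pages.items():
        words = text.lower().split()
        matches = sum(words.count(term) for term in terms)
        if matches == 0:
            continue  # a page that never mentions the query is not listed
        # log1p dampens the effect of very large link counts.
        scores[url] = matches * (1 + math.log1p(inbound[url]))
    return sorted(scores, key=scores.get, reverse=True)


pages = {
    "a": "city council vote",
    "b": "city council vote",
    "c": "sports scores",
}
links = [("c", "b"), ("a", "b")]
ranking = toy_rank(pages, links, "council vote")
```

In this sample, pages "a" and "b" match the query equally well, but "b" ranks first because two other pages link to it.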

On the other side of the firewall, search technology also serves self-contained sites. News sites in particular can benefit from search engines that send software “agents” to crawl through their daily content and their archives. Most search engine companies will license their technology to individual sites, and in fact rely on that as their primary business model now that the advertising market for portals has declined.

Search technology can also be outsourced, as is the case with Atomz Corp. Atomz search currently powers more than 50,000 sites, including CBS News, Palm and Macromedia. With outsourced search, no software needs to be installed on the customer’s Web server. The site is updated at regular intervals and re-indexed remotely.

One Atomz user is Cincinnati.com, a six-year-old Web site that houses a variety of media properties including The Cincinnati Enquirer and The Cincinnati Post (combined daily, 197,399; Sunday, 310,673).

“We were using Netscape on our own servers … it crashed constantly and even when it was running it was feature-poor and high-maintenance,” said Jeffrey Tindall, a senior developer who helped craft the original site.

The crashes resulted in lost data, requiring lengthy re-indexing.

“We also tried Excite with much the same results,” Tindall said. “We wanted to avoid administering it in-house if possible, so our administrative staff could focus their efforts on development rather than upkeep.”

Atomz, Tindall said, helped alleviate almost all of the site’s administration challenges.

“In fact, when Gannett corporate decided to pursue a solution, I strongly recommended Atomz to them.”

Gannett subsequently signed a two-year agreement with Atomz to provide search tool software for all of Gannett’s newspaper Web sites. Today Atomz is being used by a number of Gannett’s newspapers, including The (Nashville) Tennessean and The Argus Leader in Sioux Falls, S.D.

According to Tindall, Atomz tracks inbound traffic by referring URL very closely, and can thus report how many referrals various search engines generate.
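The kind of referrer tracking described above can be sketched in a few lines of Python. This is an illustration only and not Atomz’s implementation. The domain table, the function names and the sample referrers are assumptions. The idea is to read the referring URL recorded for each visit, extract its host name and tally visits by search engine.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical lookup table; a real list would be longer and kept current.
SEARCH_ENGINE_DOMAINS = {
    "google.com": "Google",
    "yahoo.com": "Yahoo!",
    "altavista.com": "AltaVista",
    "msn.com": "MSN",
}


def classify_referrer(referrer):
    """Return a search engine name, "other" or "direct" for one referring URL."""
    host = (urlparse(referrer).hostname or "").lower()
    for domain, name in SEARCH_ENGINE_DOMAINS.items():
        # Match the bare domain and any subdomain such as search.yahoo.com.
        if host == domain or host.endswith("." + domain):
            return name
    # No host means no usable referrer, such as a bookmark or typed address.
    return "other" if host else "direct"


def count_referrals(referrers):
    """Tally a list of referring URLs by source."""
    return Counter(classify_referrer(r) for r in referrers)


sample = [
    "http://www.google.com/search?q=cincinnati+news",
    "http://google.com/search?q=enquirer",
    "http://search.yahoo.com/search?p=reds+score",
    "http://example.org/links.html",
    "",
]
counts = count_referrals(sample)
```

Matching on the end of the host name, rather than on any substring, keeps an unrelated host such as notgoogle.com from being counted as Google.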

“Search engine referrers contribute a very high portion of our visitors, which is true of all newspaper sites. But we take great pains to maximize this as much as possible,” Tindall said.

The Deseret News (daily, 67,000; Sunday, 70,000) in Salt Lake City chose to keep its search engine in-house, integrated with its archive system. Its site serves the Salt Lake City and Provo metro areas. The daily has been using Convera’s RetrievalWare for its search engine since 1998.

“We looked through trade magazines, talked to other newspapers, actually looked at the solicitations coming in, and compiled a list of possible vendors,” said Dave Schneider, director of new media. “In ’98 and ’99 [application service providers] really weren’t the considerations they became later. We had, and have, a good systems/IT department and editorial-content projects are high priority with them. So going out to ASPs for what we consider an editorial-content service isn’t likely to happen.”

RetrievalWare creates an inventory of all enterprise assets, then enables users to search more than 200 document types on file servers, in groupware systems, relational databases, document management systems and Web servers. This access is managed by RetrievalWare’s synchronizers, which recognize any changes system-wide and automatically update the RetrievalWare index.
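The general idea of detecting changes and updating an index automatically can be sketched as follows. This is not RetrievalWare’s API or design, which the article does not detail. The class `IncrementalIndex`, its methods and the content-hash approach are assumptions chosen for illustration. Each document’s text is fingerprinted, and only new, changed or deleted documents are touched on each pass.

```python
import hashlib
from collections import defaultdict


class IncrementalIndex:
    """Toy inverted index that re-indexes only documents whose content changed."""

    def __init__(self):
        self.fingerprints = {}            # doc_id -> hash of last indexed text
        self.postings = defaultdict(set)  # word -> set of doc_ids

    def _remove(self, doc_id):
        # Drop every posting that points at this document.
        for doc_ids in self.postings.values():
            doc_ids.discard(doc_id)

    def sync(self, documents):
        """Bring the index in line with documents (doc_id -> text).

        Returns the ids that were removed, added or re-indexed.
        """
        changed = []
        # Documents that disappeared from the source are removed first.
        for doc_id in list(self.fingerprints):
            if doc_id not in documents:
                self._remove(doc_id)
                del self.fingerprints[doc_id]
                changed.append(doc_id)
        for doc_id, text in documents.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.fingerprints.get(doc_id) == digest:
                continue  # unchanged since the last pass
            self._remove(doc_id)
            for word in set(text.lower().split()):
                self.postings[word].add(doc_id)
            self.fingerprints[doc_id] = digest
            changed.append(doc_id)
        return changed

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))


index = IncrementalIndex()
first = index.sync({"a": "council vote", "b": "sports"})
second = index.sync({"a": "council vote", "b": "sports"})
third = index.sync({"a": "council budget"})
```

The second pass does no work because nothing changed. The third pass removes the deleted document and re-indexes the edited one.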

Search engines, both internal and external, are clearly an important component of a successful Web site. Five years ago they were often seen as an exciting option. Today, according to many experts, they’re seen as a standard operating function of a well-constructed site.