CNN - Leave it to the hound

Writer David Perry
Computing

Next-generation search tools will hunt for information based on predefined user profiles

(IDG) -- Imagine logging on to your computer and being welcomed with this message: "Good morning. Here are the briefs on your competitors I gathered for you last night."

As futuristic as this scenario might seem, it could soon be possible. Where Internet search engine vendors have faltered in creating truly personalized and refined search results, intranet search tool makers are on the cusp of major breakthroughs that would enable just that.

Many intranet managers anxiously await the possibility. "Search technology will evolve into a middleware tool that becomes part of an arsenal for gathering information. It will take people out of hunt mode," says Wayne Applehans, manager of the knowledge resource strategies group at J.D. Edwards, in Denver.

Search vendors targeting the Internet have all they can handle just trying to ensure their software can deal with the World Wide Web's wild growth, says Ron Weissman, vice president of worldwide marketing for Verity in Sunnyvale, Calif. "There are 130 million Web sites on the Internet, and that number is going to grow to 1.3 billion in the next few years. It's hard to create search technology to [keep up] with that," he says.

Meanwhile, intranets, because of their controlled environment, are perfect for the new technologies that tag documents with descriptive labels, engage in searches based on user profiles and create visual maps of search results.

Capabilities such as these will go a long way toward making it easier to get information out of intranets. Intranet searching today often takes place on a piecemeal basis, as Forrester Research discovered in a survey of 50 Fortune 1,000 companies. More than 60% of the survey respondents said they can only perform searches on individual servers and not across the enterprise.

Forrester, in Cambridge, Mass., says that only 32% of intranets are completely searchable. The firm is not alone in its belief that the way to turn these numbers around is to create search tools that work in tandem with knowledge management.

J.D. Edwards, for example, is embracing that principle. The supply chain software maker is readying the rollout of the Knowledge Garden 2.0, an intranet that will serve more than 8,200 employees, partners and customers. One of the most advanced features of the Knowledge Garden will be an extensive search tool, Applehans says.

Garden of Eden?

To compile J.D. Edwards' search database, Applehans and his team created profiles of the company's 4,200 employees. These profiles included name, job title and job needs, such as information on competitors and clients.

Employees ranked the value of the information, and Applehans' team categorized the data based on those criteria. "Buying a million-dollar search engine doesn't help if you don't organize people around the information," Applehans says.

The team used Microsoft's Site Server 3.0, which includes a tool that lets you create a site vocabulary and then allows you to automatically mark those words within documents using meta tags. Intranet managers create query forms for users to check off which resources, such as file or Web servers, they want searched, Applehans says.
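To make the vocabulary-tagging idea concrete, here is a minimal sketch in Python. It is not Site Server's tool or API; the vocabulary terms, the category labels and the function names are all hypothetical, chosen only to show how a fixed site vocabulary could be matched against a document and written out as a keywords meta tag.

```python
import re

# Hypothetical site vocabulary (term -> category); not taken from J.D. Edwards.
SITE_VOCABULARY = {
    "competitor": "competitive-intelligence",
    "client": "customer",
    "supply chain": "product",
}


def keywords_for(text, vocabulary=SITE_VOCABULARY):
    """Return the vocabulary terms found in the text, sorted alphabetically."""
    lowered = text.lower()
    found = []
    for term in vocabulary:
        # Match the term as a whole word, allowing a simple plural "s".
        pattern = r"\b" + re.escape(term) + r"s?\b"
        if re.search(pattern, lowered):
            found.append(term)
    return sorted(found)


def meta_tag(text, vocabulary=SITE_VOCABULARY):
    """Build an HTML keywords meta tag from the matched vocabulary terms."""
    terms = keywords_for(text, vocabulary)
    return '<meta name="keywords" content="{}">'.format(", ".join(terms))


if __name__ == "__main__":
    print(meta_tag("Our competitors are courting the same clients."))
```

A controlled vocabulary keeps tags consistent across departments, which is the point Applehans makes about organizing people around the information before buying a search engine.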

Site Server 3.0 allows searches of Office documents, Web pages, Active Server Pages (ASP) and text files. Upcoming versions of SQL Server and NT will feature built-in, full-text search.

"The object is to make sure that everything within the enterprise is searchable," says Mike Tuchen, Microsoft's group program manager for Site Server. "You also have to make sure that a search user interface is built into everything."

At J.D. Edwards, departmental knowledge authors insert meta tags in documents and then build an abstract on each document for their groups. Pull-down menus within Site Server simplify this task by showing only the meta tags that apply to a knowledge author's department, Applehans says.

A knowledge resource analyst validates the information submitted by the knowledge author and then checks the document into the Knowledge Garden database. Applehans says upcoming versions of the Knowledge Garden might feature the Extensible Markup Language (XML), a new World Wide Web Consortium standard for tagging Web content. The standard lets users create fields that name data; search engines use those fields to create more accurate return lists.
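The following Python sketch illustrates what searching on named fields might look like. The element names (document, title, department, topic) and the sample records are invented for illustration; the article does not describe the Knowledge Garden's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical records; the field names are assumptions, not a real schema.
DOCUMENTS = """
<documents>
  <document>
    <title>Wolfpack clustering overview</title>
    <department>engineering</department>
    <topic>software</topic>
  </document>
  <document>
    <title>Wolf pack behavior in winter</title>
    <department>library</department>
    <topic>wildlife</topic>
  </document>
</documents>
"""


def search_field(xml_text, field, value):
    """Return titles of documents whose named field equals the given value."""
    root = ET.fromstring(xml_text.strip())
    return [
        doc.findtext("title")
        for doc in root.iter("document")
        if doc.findtext(field) == value
    ]


if __name__ == "__main__":
    print(search_field(DOCUMENTS, "topic", "software"))
```

Because the query names the field it cares about, a search for the topic "software" skips the wildlife document even though both titles mention wolves.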

The Right Prescription

There's a tremendous coalition around XML, but Verity's Weissman says the standard is about two to three years away from becoming the next big thing in intranet searching.

However, intranet managers who want XML now don't have to wait. Start-up Centraal, for example, offers an XML search tool for the Internet that companies can use internally as a browser plug-in.

Centraal's technology, called Real Name System (RNS), allows companies to attach keywords to URLs so that typing in a simple word, not the whole name string, produces a document. For instance, instead of typing in "" a user could simply program his search tool to look for "Fusion." Because the page has been tagged with that name, the search tool will link directly to it, explains Keith Teare, president and CEO of the Palo Alto, Calif., company.
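At its simplest, the keyword-to-URL idea amounts to a lookup table. The Python sketch below is not Centraal's implementation; the table, the function name and the example.com address are placeholders made up for illustration.

```python
# Hypothetical keyword list; the URL is a placeholder, not a real address.
REAL_NAMES = {
    "fusion": "http://intranet.example.com/products/fusion/index.html",
}


def resolve(keyword, names=REAL_NAMES):
    """Return the URL registered for a keyword, or None if it is unknown."""
    return names.get(keyword.strip().lower())


if __name__ == "__main__":
    print(resolve("Fusion"))
```

Normalizing case and whitespace before the lookup means "Fusion", "fusion" and " FUSION " all land on the same page.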

Centraal plans on bolstering RNS with a centralized management feature, Teare says. Currently, users have to install the product as well as the predefined lists at their desktops. This means someone has to walk each addition or deletion to the list to every computer.

Walgreens, the nationwide pharmacy chain headquartered in Deerfield, Ill., hopes to have XML up and running within 12 months, says Pete Van Valin, team lead on Web systems at the company.

Van Valin has been trying to make Walgreens employees understand the importance of placing meta tags in their HTML documents, and says he's looking forward to automating the process with XML. He's still evaluating tools, however, so he doesn't have specific plans.

Trying to get hold of all the information generated from the intranet's 10,000 users would be impossible without tagging, Van Valin says. His team has created a list of standards and best practices for intranet documents but has not done formal training on tagging yet.

"Tagging is all about precision. When that precision happens, it's ideal," Microsoft's Tuchen says. Microsoft recently announced support for the XML standard and is implementing support for the standard in products such as its Internet Explorer 4.0 browser.

Doing the Math

While vendors such as Centraal favor XML-based keyword lists, other search tool makers shy away from them. San Francisco-based Autonomy, for example, opts for Bayesian logic to track word patterns in documents.

Thomas Bayes, the 18th century minister and mathematician, examined the relationship between multiple variables and determined the extent to which one variable affects the other. Applied to search, this means that rather than searching for individual words, Autonomy's engine examines patterns of words within documents, marking their occurrence together. For instance, if a user wants to search for information on Microsoft's Wolfpack, he won't receive information about wolves in the wild. Because of the user's marked pattern, Autonomy's Agentware system will know that this request is dealing with software and Microsoft.
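The article does not detail Autonomy's algorithm, so the Python sketch below swaps in a textbook naive Bayes classifier to show the general principle: words that tend to appear together in known documents shift the probability of what a query is about. The training sentences, labels and function names are invented, and equal prior probabilities are assumed for both categories.

```python
import math
from collections import Counter

# Invented training text; two categories with a handful of words each.
TRAINING = {
    "software": [
        "microsoft wolfpack cluster server software",
        "wolfpack failover cluster for windows nt",
    ],
    "wildlife": [
        "wolves hunt in a pack in the wild",
        "the wolf pack roams the forest",
    ],
}


def train(examples):
    """Count word occurrences per label and collect the overall vocabulary."""
    counts = {
        label: Counter(word for doc in docs for word in doc.split())
        for label, docs in examples.items()
    }
    vocab = {word for counter in counts.values() for word in counter}
    return counts, vocab


def classify(query, counts, vocab):
    """Return the label with the highest log-likelihood for the query words."""
    scores = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        score = 0.0  # equal priors assumed, so they are left out
        for word in query.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += math.log((counter[word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


if __name__ == "__main__":
    counts, vocab = train(TRAINING)
    print(classify("microsoft wolfpack", counts, vocab))
```

The query "microsoft wolfpack" scores higher under the software category because both words occur in the software examples and neither occurs in the wildlife ones.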

Other companies are turning to visualization to help users understand their search options. Start-up Semio, for example, offers a search tool that indexes text, creates clusters of content and then generates visual maps of those clusters for search results.

If a user searches on "NT," SemioMap 2.0 will display the returns that directly pertain to that result, and then map out related concepts such as Windows, Microsoft and operating systems. This gives users a sense of the hierarchy of their searches.
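One simple way to surface related concepts is to count which concepts appear alongside the query term. The Python sketch below does only that; it is not SemioMap's clustering method, and the concept sets and function name are made up for illustration.

```python
from collections import Counter

# Hypothetical concept sets, one per indexed document.
DOC_CONCEPTS = [
    {"nt", "windows", "microsoft"},
    {"nt", "operating systems", "microsoft"},
    {"nt", "windows"},
    {"lotus notes", "groupware"},
]


def related_concepts(query, docs):
    """Count how often each other concept shares a document with the query."""
    counts = Counter()
    for concepts in docs:
        if query in concepts:
            counts.update(concepts - {query})
    return dict(counts)


if __name__ == "__main__":
    print(related_concepts("nt", DOC_CONCEPTS))
```

A visual map could then draw the query at the center and place concepts with higher counts closer to it.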

Search No More

Searching as a stand-alone tool will eventually become obsolete, J.D. Edwards' Applehans says. Instead, search will be embedded into other systems, as is planned for NT 5.0 and Lotus Notes 5.0. Rather than having people spend time tracking down information, next-generation search tools will proactively and transparently gather information for users based on predefined profiles and job titles, he says.

Applehans says four things are necessary in future search tools:

  • They'll have to become more user-friendly and simplify search. (Boolean searches alone will no longer be acceptable.)
  • They'll have to gather and process external information as well as they gather internal data.
  • Embedded agents will have to analyze search paths and offer other ways to find information. They'll also have to study data-gathering behavior, store that information and be able to build suggested query lists based on that information.
  • They'll have to incorporate standards such as XML.
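The third requirement above, storing data-gathering behavior and building suggested query lists from it, can be sketched in a few lines of Python. The class name, the method names and the idea of keying the log by job title are assumptions made for illustration; no vendor product is being described.

```python
from collections import Counter, defaultdict


class QueryLog:
    """Stores past queries per job title and suggests the most frequent ones."""

    def __init__(self):
        self._by_title = defaultdict(Counter)

    def record(self, job_title, query):
        """Remember one query, normalized to lower case."""
        self._by_title[job_title][query.strip().lower()] += 1

    def suggest(self, job_title, limit=3):
        """Return up to `limit` queries most often run by this job title."""
        history = self._by_title.get(job_title, Counter())
        return [query for query, _ in history.most_common(limit)]


if __name__ == "__main__":
    log = QueryLog()
    log.record("sales", "Competitor pricing")
    log.record("sales", "competitor pricing")
    log.record("sales", "client list")
    print(log.suggest("sales", limit=1))
```

Keying the history by job title rather than by individual means a new hire gets useful suggestions on the first day, at the cost of less personal results.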

Arriving at such a searchable universe is going to cost, of course.

According to Forrester, implementing a full-blown search tool installation for a 20,000-person company would cost about $234,000. That figure takes into account software, servers, initial deployment design and one-year maintenance.

And that could be peanuts compared to the cost of implementing a knowledge management system from a company such as Verity.

Such an installation would tally from $300,000 to $1.2 million, not including the salaries of the people who will be ensuring that information is properly tagged and summarized, Forrester says.

For its part, J.D. Edwards spent slightly more than $600,000 to build Knowledge Garden. That figure includes the cost of software, hardware, maintenance, personnel, consulting and training, Applehans says.

J.D. Edwards expects a big payoff. The company estimates that it will save about $4 million annually by enabling employees and others to search for information through the Knowledge Garden, says Applehans, noting that the firm expects a three-year return on investment of 1,811% for the intranet. You'd have to search long and hard to find fault with returns like that.
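For readers who want to check the arithmetic, the Python sketch below applies a common definition of return on investment, net gain divided by cost, to the article's rounded figures. The formula and the assumption of a flat $4 million in savings for each of three years are ours, not J.D. Edwards'; the company's own calculation is not given.

```python
def roi_percent(annual_savings, cost, years=3):
    """Return ROI as a percentage: (total savings - cost) / cost * 100."""
    return (annual_savings * years - cost) / cost * 100


def implied_cost(annual_savings, roi_pct, years=3):
    """Return the cost that would produce the given ROI under the same formula."""
    return annual_savings * years / (1 + roi_pct / 100)


if __name__ == "__main__":
    # Rounded figures from the article: $4 million a year, $600,000 to build.
    print(roi_percent(4_000_000, 600_000))
    # Cost implied by the reported 1,811% three-year return.
    print(round(implied_cost(4_000_000, 1811)))
```

Under these assumptions a cost of exactly $600,000 yields 1,900%, while the reported 1,811% corresponds to a cost of roughly $628,000, which is consistent with "slightly more than $600,000."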