Instant Low Fruit: How Corzen Builds
Profitable Professional Content from Web Mining

22 November 2004

As publishers squirm to find new ways to leverage value out
of long-established databases, a coterie of young companies
is harvesting new data from the Web and coming up with
highly targeted content products that are more about
publishing than they are about the leading-edge technology
that drives them. New York-based Corzen is one of this new
breed that has concentrated on statistical analyses of Web
sites used for job postings and for selling autos.
Inventing new kinds of content from the "thin air" of the
Web is an increasingly attractive business model for
companies with knowledge of specific sectors' needs and
access to highly affordable content development tools.
That's great for a small company like Corzen - and
something to think about for the bigger folks in the
electronic publishing world.

Developing
databases for publishing is a great business when you can
find fresh and unique content, but for the most part the
low-lying fruit for database publishers was picked clean ages
ago. Yet there are companies that seem to have solved the
low-lying fruit problem for database publishing with an
ingeniously simple method: find new trees to harvest. Call it
Web mining, data harvesting or what have you: one of the
hottest areas in content technologies today is the effort to filter
kernels of data and meaning out of the planet's largest
database - the Web. Some of the early efforts, such as
Eliyon, have yielded pretty good fruit. Eliyon's culling of contact and
profile information on professionals from raw Web pages has
produced about 22 million entries, turning it into an
increasingly trusted premium source for major corporations and
content aggregators alike. Technology is a big part of Eliyon's
success but its real win was to apply its technology
successfully to a publishing problem that nobody else had yet
defined or considered at the time of its inception: use the Web
as the primary source for a marketable database of contacts.
But what if one could develop altogether
new content from looking at Web pages? This bit of alchemy is
the specialty of Corzen, a New York-based company that is focusing on
very specific opportunities for developing valuable
content by analyzing content from fragmented sources on the
Web. Its first two primary targets have been statistics culled
from cruising the Web sites of online job posting services and
stats on online auto marts and auto dealer Web sites. Want to
know whether job postings are up or down for engineers in
Cleveland? Corzen's got it. Want to know which online service
is used the most to sell Buicks in Philadelphia? Corzen has
that too. As with Eliyon, the Web scraping technology that powers
Corzen's products is only the means to a publishing end.
Corzen spends a goodly amount of time understanding a
marketplace and its information needs and then goes out to
develop an information product with technology and offshore
data cleansing talent that can pull information from the
markets that it is targeting. Once the products are developed and
introduced, their uniqueness and immediate value
win the "crawlees" over for subscriptions and offer lots of
upside value for other market participants trying to analyze
the impact of ecommerce on specific markets. Financial analysts
who are aware that newspaper job listings represent a dwindling
percentage of advertised jobs, for example, are likely to find
Corzen's unique data very useful.
While there are aspects of Corzen's
Silicon Alley-style operations that look a lot like those of a young
technology company, it is at heart a database
publishing company, having defined some pleasant groves of
content to harvest with little or no immediate competition.
There are relatively low barriers to entry for this kind of
operation - Web mining technologies abound and there are plenty
of offshore operations champing at the bit to use them - but
Corzen's deep knowledge and awareness of specific markets and
their content needs are likely to give it a highly targeted
level of expertise, which should keep it ahead of the pack in
its chosen markets. What lessons can be learned from this
lean and quick content development player? Here are a few
thoughts for the moment:
- Knowing your content market is more
important than the technology that services it. A lot of
investment money is chasing generalist content technology
that may or may not have a chance to take hold in any
specific marketplace. Companies like Corzen that focus on a
specific business problem to be solved with content in
specific markets are likely to succeed far more often than
companies that come up with fleeting advantages in technology
vision and then waste a lot of time and money trying to find the
markets that they will serve best. If you have a good idea
for a Web-generated content product, develop the content idea
before the technology.
- Being a publisher may not sound
sexy, but it can pay the bills. Time and again technology
companies sprout up and promise productivity gains that will
justify big sales tags for enterprise clients, gains which
generally hit human and technology limits long before any
return on investment is reached. Concentrating on the
end result - valuable content - is a formula that's working
increasingly well for publishing companies developing
workflow tools and online sites. There will always be market
niches where clients will pay big bucks for proprietary
content technology advantages, but increasingly it's a
publishing licensing model that drives the sales that will
keep your efforts growing.
- Leverage human talent before machine talent in early-stage
content product development. For better or worse,
the era of offshore human resources is now a permanent part
of the publishing world. Companies like Corzen can spring up
practically overnight with far more cost-effectiveness than
ever before using offshore talent and carefully selected
local market talent to develop credible content sources. In
this environment having the perfect technology for machine
efficiency is generally not as important in the beginning of
a content product development process as having an
affordable staff that can react quickly to changing content
product development requirements and get product to market as
quickly as possible.
Having great processing algorithms is important for tuning a
process, but when inventing content out of "thin air" it's
oftentimes the human mind's processing that will be the
difference in early sales success. There are numerous
opportunities out there for companies to take Corzen's approach
to content development and bring along new ideas for highly
valued content that draw on new content sources not
traditionally managed by publishers or aggregators. Database
publishers are wise to tune their existing models to reach new
standards of content value within those models, but they should
also consider how best to reach lower to invent new sources
that might also provide high value to their client bases.
- John Blossom