Add value to your news content with AP Taxonomy and Tagging

AP's industry-leading metadata is accurate, comprehensive and richly detailed, and it is designed specifically for use by news publishers. AP Metadata Services APIs give you direct access to the same metadata system that supports AP's award-winning global news operation.

What are the Metadata Services components?

  • News Taxonomy, a comprehensive set of standardized vocabularies built for news.
  • Tagging Service, an auto-classification system that enriches your content with relevant metadata tags from the News Taxonomy.

What are the benefits?

Good metadata offers a variety of benefits, opening up new possibilities for connecting with readers and managing content:

  • Deliver targeted, relevant news products based on particular topics.
  • Create engaging search and discovery experiences for your readers.
  • Support contextual advertising.
  • Use detailed content analytics to inform editorial coverage and planning.

Quick Links

Overview
Get a quick introduction to the AP Metadata Services (PDF).

Solutions and Use Cases
Find out how you can take advantage of AP metadata to enrich your news content (PDF).

Developer's Guide
Review the complete documentation of API methods (PDF).

Taxonomy Data Samples
View taxonomy data samples (RDF/XML, RDF/TTL, HTML).

Tagging Service Data Samples
View tagging service data samples (RDF/XML, RDF/TTL).


Talk to us

Contact us at apmetadata@ap.org

News Taxonomy

A comprehensive classification system for English-language news content

AP News Taxonomy includes standardized subjects, planned and breaking news events, people, organizations, geographic locations and more, all designed with news content in mind. Frequent updates ensure timeliness, and a rich network of semantic relationships between concepts enables creative solutions for content linking, search and discovery.

How can I use it?

Integrate the AP News Taxonomy into your publishing systems to support manual tagging, or use it in conjunction with the AP Tagging Service to apply the taxonomy to your content automatically.

In addition to standardized terminology and unique IDs, the taxonomy stores a variety of details about the people, places, and topics it contains.

These details can power enhanced search experiences, browsing and discovery, or informational displays. They include:

  • Synonyms, acronyms, and spelling variants.
  • Properties of people, places and things – such as an athlete's uniform number, the latitude and longitude of a geographic location, or the stock ticker symbol for a company.
  • Relationships between concepts, such as between two people (Parent-Child), or between a person and a group or organization (Athlete-Team).
  • Hierarchical structure for subjects and geographic locations, to enable both broad and narrow searches.
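
As a rough illustration of how synonyms and a hierarchy might support both broad and narrow searches, here is a minimal Python sketch. The `Term` class, the IDs, the "Illegal guns" synonym and the helper functions are hypothetical and do not reflect the actual AP data model; only the Crime and Illegal firearms labels come from this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Term:
    """A hypothetical taxonomy term; field names are illustrative only."""
    term_id: str
    label: str
    synonyms: List[str] = field(default_factory=list)
    parent_id: Optional[str] = None


# Made-up sample data; the IDs are placeholders, not real AP identifiers.
TERMS = [
    Term("t1", "Crime"),
    Term("t2", "Illegal firearms", synonyms=["Illegal guns"], parent_id="t1"),
]


def find_term(query: str) -> Optional[Term]:
    """Resolve a label or synonym (case-insensitive) to its term."""
    wanted = query.casefold()
    for term in TERMS:
        names = [term.label, *term.synonyms]
        if any(name.casefold() == wanted for name in names):
            return term
    return None


def narrower_ids(term_id: str) -> Set[str]:
    """Return the term's ID plus the IDs of all its descendants."""
    found = {term_id}
    changed = True
    while changed:
        changed = False
        for term in TERMS:
            if term.parent_id in found and term.term_id not in found:
                found.add(term.term_id)
                changed = True
    return found
```

A broad search for "Crime" could then match content tagged with any ID in `narrower_ids("t1")`, while a narrow search would match only the single ID returned by `find_term`.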

What is included in the taxonomy?

There are five main areas of coverage:

AP Subject

A wide variety of hierarchically structured topics ranging from broad categories (Crime) to specific concepts (Illegal firearms). Also includes many named events, such as the Academy Awards and the Tour de France. More than 4,200 terms in all.

AP Geography

Over 2,500 geographic place names arranged hierarchically – continents, world regions, countries, territories, national capitals, major world cities, US states, Canadian provinces, and a large number of US cities and towns.

AP Organization

Organizations and institutions from a wide variety of sectors: government organizations, non-profits, sports teams, colleges and universities, political and ideological groups, cultural institutions, and more. Over 2,500 different terms.

AP Person

Celebrities, artists, designers, authors, business leaders, political figures, sports figures, royalty, and other newsmakers known at the global or US national level. Coverage is especially broad for US newsmakers in politics, entertainment and sports, including complete rosters for major professional sports teams, men’s NCAA Division I basketball and football players, all US officeholders at the federal and gubernatorial levels, and all candidates for those offices. More than 160,000 individuals covered.

AP Company

Over 70,000 publicly traded companies – including all companies with primary shares trading on any of 70 major global stock exchanges, or trading as ADRs on an American exchange.

What about updates?

The AP News Taxonomy is constantly being updated to capture the latest news and the biggest newsmakers. Whether it’s this week’s IPOs or the new crop of college athletes, AP’s taxonomy developers are always working to keep the vocabularies current and relevant.

Subscribers are kept up to date in real time: as soon as a change is published, the new version becomes available in the AP News Taxonomy. Numerical versioning keeps the changes organized and in sync with the data provided by the AP Tagging Service. A detailed log of all changes is accessible through a separate API. You can keep track of all changes, or just the ones you care most about.
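
To make the idea of tracking "just the ones you care most about" concrete, the sketch below filters a change log by version number and vocabulary. It is only an illustration: the `Change` fields, the `changes_since` helper and the sample entries are assumptions, and the real change log format is documented in the Developer's Guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One hypothetical change-log entry; the real log format may differ."""
    version: int
    vocabulary: str
    term_label: str
    action: str


def changes_since(log, last_seen_version, vocabularies=None):
    """Return entries newer than last_seen_version, oldest first.

    If vocabularies is given, keep only entries from those vocabularies.
    """
    selected = [
        c for c in log
        if c.version > last_seen_version
        and (vocabularies is None or c.vocabulary in vocabularies)
    ]
    return sorted(selected, key=lambda c: c.version)


# Made-up entries for illustration only.
LOG = [
    Change(12, "AP Company", "Example Corp", "added"),
    Change(11, "AP Person", "Jane Doe", "updated"),
    Change(10, "AP Subject", "Example topic", "deprecated"),
]
```

Calling `changes_since(LOG, 10)` would return the two entries published after version 10, and passing `{"AP Subject"}` as the third argument would narrow the result to that vocabulary.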

How does it work?

Subscribers can access the taxonomy by making calls to an API and requesting the full set of terms in a given vocabulary, a subset of terms, or information about a particular term. Calls are also available for retrieving deprecated terms, term change logs, and additional information about the structure of the taxonomy. Taxonomy data can be returned in Semantic Web-compatible formats, such as RDF/XML or RDF/TTL. Information about a particular term can also be returned as HTML.
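
For readers who have not worked with RDF/XML before, the following sketch reads a small document using only the Python standard library. The sample document, its `example.invalid` URIs and the use of SKOS-style `prefLabel` and `broader` properties are assumptions made for illustration; this page does not specify the schema, so check the data samples and the Developer's Guide for the actual structure.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

# Hypothetical response body; not an actual AP taxonomy record.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://example.invalid/term/illegal-firearms">
    <skos:prefLabel xml:lang="en">Illegal firearms</skos:prefLabel>
    <skos:broader rdf:resource="http://example.invalid/term/crime"/>
  </rdf:Description>
</rdf:RDF>
"""


def read_terms(rdf_xml: str) -> list:
    """Extract URI, label and broader-term URIs from each description."""
    root = ET.fromstring(rdf_xml)
    about = f"{{{NS['rdf']}}}about"
    resource = f"{{{NS['rdf']}}}resource"
    terms = []
    for node in root.findall("rdf:Description", NS):
        label = node.find("skos:prefLabel", NS)
        terms.append({
            "uri": node.get(about),
            "label": label.text if label is not None else None,
            "broader": [b.get(resource) for b in node.findall("skos:broader", NS)],
        })
    return terms
```

A dedicated RDF library would handle the full range of RDF/XML serializations; plain XML parsing like this only works when the document layout is known in advance.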

To learn more, refer to the Developer's Guide.

Tagging Service

Enriches your content with tags from AP News Taxonomy

The AP Tagging Service receives your English-language news content and automatically returns relevant metadata, using standardized terminology from the AP News Taxonomy.

This smart service goes well beyond mere text extraction; it uses human-created semantic rules to understand your content and identify the most pertinent concepts and topics.

Why human-managed rules?

Human-managed rules allow for more precise control over the service's performance.

Although the system will recognize and return specific entities that it finds in the submitted content (also known as "text extraction"), it also uses human-created semantic rules to identify topics that may not be explicitly mentioned in the text at all.

For example, a story about a particular country music star can trigger the "Country music" subject, even if the word "music" does not appear in the story.
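
The country music example can be pictured with a toy rule table like the one below. This is a deliberately simplified sketch and not how the AP service is implemented: the `RULES` mapping, the `tag` function and the performer "Jane Example" are all made up, and real semantic rules are presumably far richer than phrase matching.

```python
import re

# Hypothetical rules: trigger phrase -> subjects it implies.
# "Jane Example" is a made-up performer, not a real taxonomy entry.
RULES = {
    "Jane Example": ["Country music"],
}


def tag(text: str) -> list:
    """Return subjects implied by any trigger phrase found in the text."""
    subjects = []
    for phrase, implied in RULES.items():
        # Match the phrase as whole words, ignoring case.
        if re.search(rf"\b{re.escape(phrase)}\b", text, re.IGNORECASE):
            for subject in implied:
                if subject not in subjects:
                    subjects.append(subject)
    return subjects
```

A story that names the performer is tagged with the subject even though neither "country" nor "music" appears in its text.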

What types of metadata are returned?

Drawing from the AP News Taxonomy, the tagging service looks at each piece of submitted content and returns standardized names, IDs and other properties for all the relevant taxonomy terms that triggered a semantic rule. You can specify which of the five areas of coverage the service should return.

After all the matching metadata values have been identified, the service checks for additional standardized names and IDs based on relationships stored in the AP News Taxonomy. For example, the subject hierarchy ensures that any item tagged with "Food safety" is also tagged with "Health", and any content that picks up a sports league is tagged with the relevant sport subject.
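
The hierarchy step described above amounts to adding every ancestor of each matched term. A minimal sketch, assuming a simple child-to-parent mapping: the `PARENT` table and the `expand_with_ancestors` helper are hypothetical, and only the Food safety and Health pairing comes from this page (in the real taxonomy the two may be separated by intermediate levels).

```python
# Hypothetical child -> parent links; only the "Food safety" -> "Health"
# relationship is taken from the text, and it may not be a direct link.
PARENT = {
    "Food safety": "Health",
}


def expand_with_ancestors(tags):
    """Return the tags plus every ancestor, without duplicates, in order."""
    result = []
    for tag in tags:
        current = tag
        # Walk up the hierarchy until the top or an already-added term.
        while current is not None and current not in result:
            result.append(current)
            current = PARENT.get(current)
    return result
```

Because ancestors are added at tagging time, a later search for the broad term finds the item without needing to consult the hierarchy again.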

Finally, the metadata output is enhanced with additional data properties. For example, companies are given a ticker, athletes are associated with their teams, and geographic locations get latitude and longitude data. You can also access the AP News Taxonomy for additional information about any given tag.

How does it work?

Subscribers can access the tagging service by making calls to an API. Content may be submitted as plain text or XML.

Tagging service data can be returned in Semantic Web-compatible formats, such as RDF/XML or RDF/TTL. It can also be returned in Simple XML format.

To learn more, refer to the Developer's Guide.