APIs
Welcome to the ContentGems API V1 documentation
Follow these steps to get started with the ContentGems API:
- Sign up for a ContentGems account.
- Contact us to enable API access.
- Set up your Filters and Feed Collections in the ContentGems web app.
- Use the ContentGems API to get articles in JSON format.
You can confirm that the API is operational by using the Ping endpoint.
API Highlights
- API type: REST (JSON via HTTPS).
- Authentication: Via HTTP Basic Authentication.
- Rate limits: This is up to you, since we charge for every request. Typical usage is 1-50 requests per day per Filter.
- Security: All requests are performed via SSL. We properly encrypt all credentials on the wire and in the DB. We filter credentials so they never get stored in our server logs. Our firewalls are locked down tightly.
API Concepts
API Status
API Authentication
You can use HTTP Basic Authentication with your account’s API key by replacing the username with your API key and the password with an arbitrary value:
curl -u api_key:foo https://contentgems.com/api/v1/...
You can find your account’s API key in the ContentGems web app’s user profile page. Navigate there by signing in to the web app and selecting ‘Configure’ / ‘Your Profile’ in the top bar.
Make sure to keep your API key confidential, just like your password.
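The curl call above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `basic_auth_header` and `authed_request` and the placeholder password "x" are assumptions made for this example. The request is only prepared, not sent.

```python
import base64
import urllib.request

API_BASE = "https://contentgems.com/api/v1"  # base URL as used throughout this documentation


def basic_auth_header(api_key: str, password: str = "x") -> str:
    """Build an HTTP Basic Authorization header value.

    The API key takes the place of the username; the password is an
    arbitrary placeholder ("x" here is just an example value).
    """
    token = base64.b64encode(f"{api_key}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def authed_request(path: str, api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated GET request for an API path."""
    req = urllib.request.Request(f"{API_BASE}/{path.lstrip('/')}")
    req.add_header("Authorization", basic_auth_header(api_key))
    return req


# Sending the request needs a real API key, so it is left commented out:
# import json
# with urllib.request.urlopen(authed_request("interest_folders.json", "YOUR_API_KEY")) as resp:
#     folders = json.load(resp)
```

Reading the key from an environment variable or a secrets store, rather than hard-coding it, helps keep it confidential.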
Domains
This endpoint allows you to retrieve the ContentGems domain_id for a given Domain. This is used for blocking articles from certain Domains.
Please follow this procedure for every Domain you want to block:
- Retrieve the domain_id for the Domain using this API endpoint.
- Add the domain_id to a list of domain_ids you want to block on your end.
- Add the domain_ids_to_exclude request parameter to your articles API requests. Use the list of domain_ids as value.
Resource URL
GET https://contentgems.com/api/v1/domains/lookup_id.json
Request Parameters
Name | Description | Optional, required, defaults |
domain_name | The name of the domain. The primary domain contentgems.com is considered different from the subdomain www.contentgems.com. This parameter is case insensitive. | Required |
Response Attributes
The response is a Domain JSON object with the following attributes:
Name | Description |
domain_name | The name of the domain after it has been normalized by ContentGems. |
domain_id | The id of the domain if it exists. Otherwise null. |
error | An error message if there is a problem handling the request. |
Example
Request:
GET https://contentgems.com/api/v1/domains/lookup_id.json?domain_name=ContentGems.com
Response:
{
"domain_name": "contentgems.com",
"domain_id": 962209
}
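The blocking procedure described in this section might look like the following Python sketch. The function names are hypothetical, and the HTTP call itself is omitted; the code only builds the lookup URL, reads a response body, and formats the collected ids for the domain_ids_to_exclude parameter.

```python
import json
from urllib.parse import quote, urlencode

API_BASE = "https://contentgems.com/api/v1"


def lookup_url(domain_name: str) -> str:
    """URL of the domain lookup endpoint for one domain name."""
    return f"{API_BASE}/domains/lookup_id.json?" + urlencode({"domain_name": domain_name})


def domain_id_from_response(body: str):
    """Return the domain_id from a lookup response, or None if the domain is unknown.

    Raises ValueError if the response carries an error message.
    """
    data = json.loads(body)
    if data.get("error"):
        raise ValueError(data["error"])
    return data.get("domain_id")


def exclusion_params(domain_ids) -> str:
    """Query-string fragment of the form domain_ids_to_exclude[]=1&domain_ids_to_exclude[]=2."""
    pairs = [("domain_ids_to_exclude[]", str(i)) for i in domain_ids]
    # Keep the square brackets unescaped, matching the examples in this documentation.
    return urlencode(pairs, safe="[]", quote_via=quote)


# Step 1: parse a lookup response (the example response from this section).
blocked = [domain_id_from_response('{"domain_name": "contentgems.com", "domain_id": 962209}')]
# Steps 2 and 3: keep the ids on your end and append them to article requests.
print(exclusion_params(blocked))  # domain_ids_to_exclude[]=962209
```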
Filter Folders and Filters
Returns a list of all Filter Folders and their included Filters for the authenticated user.
Resource URL
GET https://contentgems.com/api/v1/interest_folders.json
Request Parameters
N/A
Response Attributes
The response is an array of Filter Folder objects with the following attributes:
Filter Folder Attributes:
Name | Description |
id | The Filter Folder's id. |
name | The Filter Folder's name. |
interests | A list of Filters inside the current Filter Folder. See below for the Filter attributes. |
Filter Attributes:
Name | Description |
id | The Filter's id. |
name | The Filter's name. |
query | The Filter's query. |
Example
Request:
GET https://contentgems.com/api/v1/interest_folders.json
Response:
[
{
"id": 1789,
"name": "Content Marketing",
"interests": [
{
"id": 1433,
"name": "Curation",
"query": "curation \"content curation\" curate"
},
... (more interests)
]
},
... (more interest folders)
]
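As a rough illustration of working with this response, the sketch below flattens the nested folders into a lookup table keyed by Filter id. It assumes the Filters are listed under the "interests" key, as in the example response above; the function name `filters_by_id` is made up for this example.

```python
import json


def filters_by_id(body: str) -> dict:
    """Map each Filter id to its folder name, Filter name, and query."""
    result = {}
    for folder in json.loads(body):
        for flt in folder.get("interests", []):
            result[flt["id"]] = {
                "folder": folder["name"],
                "name": flt["name"],
                "query": flt["query"],
            }
    return result


# Sample body modeled on the example response (elided entries left out).
sample = json.dumps([
    {
        "id": 1789,
        "name": "Content Marketing",
        "interests": [
            {"id": 1433, "name": "Curation", "query": 'curation "content curation" curate'},
        ],
    }
])

print(filters_by_id(sample)[1433]["name"])  # Curation
```

The Filter ids collected this way are the values to use as interest_id when requesting articles for a Filter.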
API Ping
Allows you to test if the API is operational at all.
Ping with text response
This is the simplest test possible to run against the ContentGems API. It doesn’t require authentication or special formats.
Example
Request:
GET https://contentgems.com/api/v1/ping
Response:
"pong" (as text)
Ping with JSON response
This test allows you to test the JSON response format.
Example
Request:
GET https://contentgems.com/api/v1/ping.json
Response:
{ "success": "pong" }
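A health check built on these two endpoints could interpret the response bodies roughly as follows. This is a sketch under assumptions: the function name `ping_ok` is hypothetical, and it is not known whether the text response includes the surrounding quotes, so the code tolerates both.

```python
import json


def ping_ok(body: str, as_json: bool = True) -> bool:
    """Return True if a ping response body signals that the API is up."""
    if as_json:
        try:
            return json.loads(body).get("success") == "pong"
        except (ValueError, AttributeError):
            # Not valid JSON, or valid JSON that is not an object.
            return False
    # Text variant: accept the word with or without surrounding double quotes.
    return body.strip().strip('"') == "pong"


print(ping_ok('{ "success": "pong" }'))   # True
print(ping_ok("pong", as_json=False))     # True
```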
Articles
This endpoint returns the top articles for Filters or for ad-hoc queries.
Make sure to review the API query documentation.
Articles for Filters
Resource URL
GET https://contentgems.com/api/v1/interests/<interest_id>/recommendations.json
Request Parameters
Name | Description | Optional, required, defaults |
interest_id | The id of the Filter. | Required |
max_items | The maximum number of articles to return. | Optional, default: 5 |
query_refinement | A string to be appended to the query defined on the Filter to further refine the articles. Example: On a "Restaurants" base query, you might want to add +"San Francisco" as a refinement. Please see the query documentation for syntax and operators. Make sure to URL encode any special characters. The earlier example would look like this: GET https://contentgems.com/api/v1/interests/123/recommendations.json?query_refinement=%2B%22San%20Francisco%22 | Optional, default: nil |
Response Attributes
The response is an array of article objects. Each article has the following attributes:
Name | Description |
excerpt | The first 300 characters of the article. |
found_at | The date and time in UTC when ContentGems found the article. |
host_name | The host name of the URL. |
images | An array of image URLs. |
popularity | A measure of the article's popularity on a scale of 0 to 100. 0 means not popular, 100 means extremely popular. |
title | The HTML document title of the article, or the URL if no title is present. |
url | The URL of the article. |
videos | An array of video URLs. |
word_count | The number of words in the article. |
Example
Request:
GET https://contentgems.com/api/v1/interests/1433/recommendations.json
Response:
[
{
"excerpt": "Lorem ipsum ...",
"found_at": "Fri Jul 21 13:58:46 +0000 2012",
"host_name": "contentgems.com",
"images": [
{ "url": "http://static.contentgems.com/image1.jpg" },
{ "url": "http://static.contentgems.com/image2.jpg" }
],
"popularity": 42,
"title": "Accelerate your content marketing",
"url": "https://contentgems.com",
"videos": [
{ "url": "http://static.contentgems.com/video1.mp4" },
{ "url": "http://static.contentgems.com/video2.mp4" }
],
"word_count": 256
},
...
]
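One way to assemble the request URL for a Filter, including a URL-encoded query_refinement, is sketched below. The helper name `filter_articles_url` is an assumption for this example; only the parameters documented in this section are used.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://contentgems.com/api/v1"


def filter_articles_url(interest_id: int, max_items=None, query_refinement=None) -> str:
    """Build the articles URL for one Filter, encoding optional parameters."""
    params = []
    if max_items is not None:
        params.append(("max_items", str(max_items)))
    if query_refinement:
        params.append(("query_refinement", query_refinement))
    url = f"{API_BASE}/interests/{interest_id}/recommendations.json"
    if params:
        # quote (rather than the default quote_plus) encodes spaces as %20.
        url += "?" + urlencode(params, quote_via=quote)
    return url


print(filter_articles_url(123, query_refinement='+"San Francisco"'))
# https://contentgems.com/api/v1/interests/123/recommendations.json?query_refinement=%2B%22San%20Francisco%22
```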
Articles for ad-hoc queries
Resource URL
GET https://contentgems.com/api/v1/query/recommendations.json
If you have a lot of query params, then you can also use a POST request and submit the params in the request body to avoid length restrictions on URLs for GET requests.
Request Parameters
Name | Description | Optional, required, defaults |
query | The search query as a string. Please see the query documentation for details. Query string example for the required phrase +"Content Marketing": query=%2B%22Content%20Marketing%22. Query string example for the two required fuzzy terms +solar~0.3 +panels~0.3: query=%2Bsolar~0.3%20%2Bpanels~0.3 | Required |
dedup_similarity_threshold | ContentGems removes duplicate articles from the results. It computes the Jaccard index for each article pair in the result set and groups together any articles with very high Jaccard similarity indexes, only showing the most popular one. The `dedup_similarity_threshold` determines at what Jaccard index two articles are considered duplicates. Practical values are between 0.5 (vaguely similar article body text) and 0.99 (identical article body text). | Optional, default: 0.85 |
domain_ids_to_exclude | List of domain ids to exclude from articles. No URLs from any of the excluded domains will be included in articles. Query string example: domain_ids_to_exclude[]=1&domain_ids_to_exclude[]=2. You can retrieve a Domain Name's domain_id using the Domains API endpoint. Note: You can also get a recommendation’s domain_id from a result in the web UI by hovering over the result’s ‘Block Domain’ link and reading out the `i_source_id` parameter. | Optional, default: [] (empty array, no exclusions) |
feed_bundle_ids_to_exclude | List of Feed Collection ids to exclude from articles. No URLs from any of the excluded Feed Collections will be included in articles. Query string example: feed_bundle_ids_to_exclude[]=1&feed_bundle_ids_to_exclude[]=2. You can retrieve a Feed Collection's id from the web UI: Go to "Manage Sources" / "Manage Bundles". Hover over the "Edit" link for the Collection you want and read out the number after the /i_feed_bundles/ segment in the URL. | Optional, default: [] (empty array, no exclusions) |
feed_bundle_ids_to_use | List of Feed Collection ids to limit articles to. Only URLs from the listed Feed Collections will be included in articles. Query string example: feed_bundle_ids_to_use[]=1&feed_bundle_ids_to_use[]=2. You can retrieve a Feed Collection's id from the web UI: Go to "Manage Sources" / "Manage Bundles". Hover over the "Edit" link for the Collection you want and read out the number after the /i_feed_bundles/ segment in the URL. | Optional, default: [] (empty array, no limitations) |
found_within | Articles must have been found within the given time period. One of 'last_24_hours' or 'last_7_days'. Query string example: found_within=last_24_hours | Optional, default: 'last_24_hours' |
max_items | The maximum number of articles to return. Integer between 1 and 50. | Optional, default: 5 |
max_word_count | Articles must have at most this many words. Integer between 1 and 10,000. | Optional, default: Nil |
minimum_popularity | Articles must have a popularity equal to or higher than this. One of: "none", "very_low", "low", "medium", "high", or "very_high". | Optional, default: "none" |
minimum_score | Articles must have a score higher than this. Scores are computed by the indexed search engine and typically range from 0.01 to 15. Query string example: minimum_score=0.5 | Optional, default: Nil |
min_word_count | Articles’ body text must have at least this many words. Integer between 1 and 10,000. | Optional, default: Nil |
min_word_count_title | Articles’ titles must have at least this many words. Integer between 1 and 10,000. | Optional, default: Nil |
must_have_image | Articles must have images. Boolean true or false. Query string example: must_have_image=true | Optional, default: false |
must_have_video | Articles must have videos. Boolean true or false. Query string example: must_have_video=true | Optional, default: false |
rank_by | Articles will be ranked by this attribute before the top items are selected. One of 'relevancy' or 'popularity'. Query string example: rank_by=popularity | Optional, default: 'popularity' |
response_fields | List of response fields to return. Array containing any of the following strings: | Optional, default: all fields |
sort_by | Articles will be sorted by this attribute after the top items are selected. One of 'date', 'relevancy' or 'popularity'. Query string example: sort_by=date | Optional, default: 'date' |
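To build intuition for choosing a dedup_similarity_threshold value, the Jaccard index of two texts can be computed as the size of the intersection of their word sets divided by the size of the union. The sketch below is an illustration only: how ContentGems tokenizes article text, and whether it compares with "greater than" or "greater than or equal", is not documented here, so whitespace splitting and `>=` are assumptions.

```python
def jaccard_index(text_a: str, text_b: str) -> float:
    """Jaccard index of the lowercase word sets of two texts (0.0 to 1.0)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty texts are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def is_duplicate(text_a: str, text_b: str, threshold: float = 0.85) -> bool:
    """Treat two texts as duplicates when their Jaccard index reaches the threshold."""
    return jaccard_index(text_a, text_b) >= threshold


a = "solar panels cut energy costs"
b = "solar panels cut energy bills"
# 4 shared words out of 6 distinct words overall, so the index is about 0.67.
print(is_duplicate(a, b))        # False at the default threshold of 0.85
print(is_duplicate(a, b, 0.5))   # True at a looser threshold of 0.5
```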
Response Data
Contains all the fields documented under Filters, with the addition of the following:
Name | Description |
score | The score computed by the indexed search engine. It typically ranges from 0.01 to 15. |
highlights | Up to 3 document fragments that match the query, with matching terms highlighted. This is useful for debugging queries, as well as providing summarized context to readers. Example highlights for the query string 'test': { "highlights": [ "Subject: <em>test</em> In a recent post, they report the results", "Rorschach <em>test</em> of U.S.-Turkish, with Ankara viewing them" ] } |
Example
Request:
GET https://contentgems.com/api/v1/query/recommendations.json?query=test%20query&response_fields[]=title&response_fields[]=url
Response:
Same as response data for Filters. Please see above for details.
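The request parameters for ad-hoc queries, including the array-style parameters, could be encoded along these lines. This is a sketch with hypothetical helper names. For the POST variant it assumes the body is form-encoded in the same way as the query string, which this documentation does not state explicitly.

```python
import urllib.request
from urllib.parse import quote, urlencode

QUERY_URL = "https://contentgems.com/api/v1/query/recommendations.json"


def encode_params(query: str, **options) -> str:
    """Encode the query plus optional parameters as a query string.

    Lists become repeated name[]=value pairs; booleans become true/false.
    """
    pairs = [("query", query)]
    for name, value in options.items():
        if isinstance(value, (list, tuple)):
            pairs.extend((f"{name}[]", str(item)) for item in value)
        elif isinstance(value, bool):
            pairs.append((name, "true" if value else "false"))
        else:
            pairs.append((name, str(value)))
    # Leave [] unescaped and encode spaces as %20, as in the examples here.
    return urlencode(pairs, safe="[]", quote_via=quote)


def query_get_url(query: str, **options) -> str:
    """Full GET URL for an ad-hoc query."""
    return f"{QUERY_URL}?{encode_params(query, **options)}"


def query_post_request(query: str, **options) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the params in the body."""
    body = encode_params(query, **options).encode("ascii")
    return urllib.request.Request(QUERY_URL, data=body, method="POST")


print(query_get_url("test query", response_fields=["title", "url"]))
# https://contentgems.com/api/v1/query/recommendations.json?query=test%20query&response_fields[]=title&response_fields[]=url
```

The prepared requests still need the Authorization header described under API Authentication before they are sent.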
API queries
Below is a list of ContentGems articles API query examples. You need to URL encode each plain text query before submitting it to the ContentGems API.
Description | Plain text query | URL encoded query |
Basic query: Article must contain "banana". | +banana | %2Bbanana |
Phrases: Article must contain "banana boat". | +"banana boat" | %2B%22banana%20boat%22 |
Multiple terms: Article must contain "apple" and "banana". | +banana +apple | %2Bbanana%20%2Bapple |
Should, must, must-not terms: Article should contain "apple" and/or "banana", must contain "lemon", and must not contain "pear". | apple banana +lemon -pear | apple%20banana%20%2Blemon%20-pear |
Field specifiers: Article must contain "apple" in the title and "banana" in the excerpt. | +title:apple +excerpt:banana | %2Btitle%3Aapple%20%2Bexcerpt%3Abanana |
Boolean OR: Article must contain "apple" or "banana" in the title. | +title:(apple banana) | %2Btitle%3A%28apple%20banana%29 |
Wildcards: Article must contain words that start with "banana", e.g., "banana" and "bananas". | +banana* | %2Bbanana%2A |
Boosting: Prefer articles that contain "apple" over those that contain "banana" or "pear". | apple^2 banana pear | apple%5E2%20banana%20pear |
Fuzzy matches: Match both "color" as well as "colour". | color~0.3 | color~0.3 |
Available fields
- title - this matches the title only
- body - this matches the entire article body text
- excerpt - this matches the text body’s first 300 characters
URL encoding
All your API queries must be URL encoded before you submit them to ContentGems. You can use an online URL encoder or use the table below for common characters:
- (space) => %20
- " (double quote) => %22
- : (colon) => %3A
- + (plus sign) => %2B
- ^ (boost operator) => %5E
- * (wildcard operator) => %2A
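Instead of encoding characters by hand, a standard library routine can do the work. The minimal sketch below uses Python's `urllib.parse.quote` with no characters marked as safe; the wrapper name `encode_query` is just for this example.

```python
from urllib.parse import quote


def encode_query(plain_text_query: str) -> str:
    """Percent-encode a plain text query for use as a URL parameter value."""
    # safe="" makes quote encode every reserved character, including + " : ^ * ( )
    return quote(plain_text_query, safe="")


for plain in ['+banana', '+"banana boat"', '+title:(apple banana)', 'apple^2 banana pear']:
    print(plain, "=>", encode_query(plain))
# +banana => %2Bbanana
# +"banana boat" => %2B%22banana%20boat%22
# +title:(apple banana) => %2Btitle%3A%28apple%20banana%29
# apple^2 banana pear => apple%5E2%20banana%20pear
```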