HR Dataset
Specifications
Release: November 14, 2024
Version: 0.1.0
Size: ~100 GB (.csv)
Update Freq: Monthly
Note: This file contains all people in the Universal Person contact file, with additional fields for HR-specific features, such as job history and time-in-role.
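At this size, the file is usually easier to process as a stream than to load into memory. The sketch below reads rows one at a time with Python's standard csv module and computes a fill rate as the share of rows with a non-empty value, which is how the Fill Rate column in the schema table appears to relate to Counts (count divided by total records). It assumes a comma-delimited file with a header row of feature names and empty strings for nulls; none of these are confirmed by this page. The function name fill_rates and the sample rows are illustrative only.

```python
import csv
import io
from collections import Counter


def fill_rates(lines, columns):
    """Stream CSV rows and return {column: (non_empty_count, fill_rate)}.

    `lines` is any iterable of text lines, e.g. an open file handle.
    Assumes a header row and that nulls are stored as empty strings.
    """
    reader = csv.DictReader(lines)
    filled = Counter()
    total = 0
    for row in reader:
        total += 1
        for col in columns:
            # Treat missing keys, empty strings, and whitespace as unfilled.
            if (row.get(col) or "").strip():
                filled[col] += 1
    return {
        col: (filled[col], filled[col] / total if total else 0.0)
        for col in columns
    }


# Tiny in-memory sample standing in for the real file (made-up values).
sample = io.StringIO(
    "up_id,first_name,business_email\n"
    "1,Ada,ada@example.com\n"
    "2,Grace,\n"
)
rates = fill_rates(sample, ["up_id", "business_email"])
```

For the real file, an open file handle can be passed in place of the sample, for example `open(path, newline="", encoding="utf-8")`, where the path and encoding are also assumptions.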
Data Schema, Counts, and Fill Rates
Feature | Description | Counts | Fill Rate |
---|---|---|---|
up_id | Universal Person ID. This is a unique identifier assigned to a person, linking all business and consumer information to a single individual. The up_id and the company_id are the primary keys throughout the co-op data files. | 313,252,124 | 100% |
first_name | First name of the person in this record. | 313,206,871 | 100% |
last_name | Last name of the person in this record. | 313,207,338 | 100% |
business_email | The primary business email associated with this person. | 101,268,449 | 32% |
programmatic_business_emails | This feature is an array of all common business email formats, built from the first_name, last_name, and company_domain fields in the B2B and UP files. | 154,651,483 | 49% |
mobile_phone | Mobile number uniquely associated with this person, regardless of business or personal association. | 163,275,247 | 52% |
direct_number | This is a non-mobile business or personal number for the contact. | 132,604,671 | 42% |
personal_phone | This is a non-mobile business or personal number for the contact. | 132,604,671 | 42% |
linkedin_url | The LinkedIn URL for this person. | 98,090,403 | 31% |
personal_address | Where this person lives. | 217,145,759 | 69% |
personal_address_2 | Where this person lives. | 34,217,177 | 11% |
personal_city | The city in which this person resides. | 275,214,179 | 88% |
personal_state | The upper-case standard abbreviation for the person's home state. This field contains the 50 US states plus Washington, D.C. Please see professional_state and company_state for other state data. | 276,885,040 | 88% |
personal_zip | The 5-digit US zip code for the person's home address. | 225,539,454 | 72% |
personal_zip4 | The 4-digit extension of the person's home zip code. | 209,930,049 | 67% |
personal_emails | The primary personal email observed for this person record. | 208,735,026 | 67% |
additional_personal_emails | Any additional personal emails associated with this person, excluding any emails previously reported as invalid. Stored as a comma-separated array. | 35,932,562 | 11% |
gender | Gender of the person (M/F/U). | 259,713,360 | 83% |
age_range | The age range this contact fits within. The ranges are industry standard demographic ordinal values, and can be found here. Null values indicate unknown. | 191,771,484 | 61% |
married | If the person is married, the value will be 'Y'. Blank values represent unknown. | 139,368,516 | 44% |
children | Indicates if the person associated with this record has children. Values are 'Y', 'N', or NULL. Null values indicate unknown. | 99,556,499 | 32% |
income_range | The income of this person, mapped to industry standard demographic ranges. Link is to the mapping ranges. Null values indicate unknown. | 114,696,611 | 37% |
net_worth | Estimate of a household's total financial assets minus liabilities, reported within industry standard demographic ranges. Link is the range table. Null values indicate unknown. | 137,723,940 | 44% |
homeowner | Indicates whether the person in this record is a homeowner. Y and N are observed values; P indicates a likely homeowner, based on probabilistic modeling; null values represent 'unknown'. | 144,659,797 | 46% |
job_title | The current job title of the person in this record. | 152,444,438 | 49% |
seniority_level | Seniority derived from the job title using the Seniority Clustering / Department mapping model. | 148,901,846 | 48% |
department | Department of the job_title in this person's record. | 122,837,441 | 39% |
professional_address | Where this person shows up to work - could be the same as personal, company or different depending on company structure. | 47,611,966 | 15% |
professional_address2 | Where this person shows up to work - could be the same as personal, company or different depending on company structure. | 14,993,497 | 5% |
professional_city | Where this person shows up to work - could be the same as personal, company or different depending on company structure. | 52,661,517 | 17% |
professional_state | The state occurring within the professional address - the state where this person works from. This field will contain all 50 US states, Washington DC, and the five populated US territories - American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands. Please see company_state and personal_state for other state data. | 52,770,300 | 17% |
professional_zip | The 5-digit zip code occurring within the professional address. | 47,517,475 | 15% |
professional_zip4 | The 4-digit extension of the professional address zip code. | 30,546,110 | 10% |
company_name | Company name associated with the person's job title. | 160,094,311 | 51% |
company_domain | The primary domain associated with this company. | 154,729,448 | 49% |
company_phone | General phone number for the company. | 147,008,663 | 47% |
company_sic | The 4 digit Standard Industrial Classification (SIC) code(s) associated with the company. Link is to SIC classification taxonomy. | 144,374,743 | 46% |
company_address | The physical address of the company headquarters. | 139,032,636 | 44% |
company_city | The city of the company headquarters. | 160,047,523 | 51% |
company_state | State code of company headquarters. This field will include 50 states, Washington DC, the five US territories (American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands), and a variety of non-US user-reported data. Please see professional_state and personal_state for additional state data. | 155,924,372 | 50% |
company_zip | Zip of company headquarters. | 144,087,523 | 46% |
company_linkedin_url | The URL for the company's LinkedIn profile. | 142,546,464 | 46% |
company_revenue | Revenue range associated with the company. This is derived from the employee count and mapped in accordance with the mapping linked here. | 162,971,254 | 52% |
company_employee_count | The number of observed US enterprise employees of this company, fit within standard firmographic ranges, which are listed in this link. | 162,971,254 | 52% |
primary_industry | The primary industry for a company, as observed in multiple firmographic data sets. | 121,003,775 | 39% |
business_email_validation_status | The validation status of the associated business email. The following values are allowed: 'valid' = the host email server does not return an 'invalid' response to an email sent to this email address; 'Valid-ESP' = the email validation flag was provided by an Email Service Provider (ESP); 'Valid-Digital' = the validation flag originated from a cookie signal or digital tag. | 76,491,370 | 24% |
personal_emails_validation_status | The validation status of the associated personal email. 'Valid' indicates the email has been validated; a null value represents unknown. | 164,621,999 | 53% |
personal_emails_last_seen | The Unix timestamp of the last observation of this personal email, via validation or other verifiable data. Null values indicate that no data is available. | 164,621,999 | 53% |
company_last_updated | This is the Unix timestamp of the last update for the company_name field in the B2B and Universal Person files. This field records when the contact record was last confirmed at the company. Note: companies may have multiple 'company_last_updated' values, reflecting the last updates across different employees. | 74,520,207 | 24% |
job_title_last_updated | The Unix timestamp of the last update for the job title. | 66,825,365 | 21% |
last_updated | The Unix timestamp of when any value within the record was last updated. | 252,864,619 | 81% |
work_history | This field is a JSON-formatted work history which, when available, includes data for: company name, position, duration, start_time, end_time, job_description, location, and social_url. | 76,682,020 | 24% |
education_history | This field is a JSON-formatted education history which, when available, includes data for: degree, description, start_time, end_time, institution_name, and social_url. | 43,815,159 | 14% |
cc_id | A persistent, unique ID assigned to each company in the Co-Op Firmographic dataset. The cc_id and the up_id are used as the key linkage across the co-op data sets. | 154,729,448 | 49% |
company_description | This is a varchar(2000) field with the company description from online profiles. | 128,137,470 | 41% |
related_domains | These are sub-domains and top-level domains that roll up to or redirect to the company_domain field in this record. | 31,937,910 | 10% |
social_connections | The number of social media connections observed for this person, given in ranges of 1-9, 10-49, 50-99, 100-249, 250-499, 500+. | 62,559,983 | 20% |
dpv_code | The Delivery Point Confirmation Code is the primary method used by the USPS to indicate whether an address was considered deliverable or undeliverable. Y=confirmed for primary and secondary numbers, D=confirmed for primary number only, secondary number missing, S=confirmed for primary number only, secondary number present but unconfirmed, N=both primary and secondary numbers failed to confirm, Blank=no data. | 149,038,903 | 48% |
company_naics | North American Industry Classification System, denoting the primary industry(s) the company is listed in. This field is a comma-separated array with up to 5 NAICS codes. | 146,010,160 | 47% |
company_country | The ISO Alpha-2 country code of the company address. | 94,407,705 | 30% |
contact_country | The ISO Alpha-2 country code of the personal address. | 75,995,885 | 24% |
job_title_normalized | This is the cleaned-up version of the job_title, incorporating spelling corrections and standardized title elements, and normalized to reduce the number of variations. For example, "CEO", "C.E.O.", "Chief Exec Ofc", etc. all become "Chief Executive Officer". | 23,892,991 | 8% |
seniority_level_2 | The updated model for seniority levels, modeled off the job_title and primary_industry. See this page for additional detail: https://docs.5x5coop.com/docs/seniority-and-department-20 | 138,025,191 | 44% |
department_2 | The updated model for departments, modeled off the job_title and primary_industry. See this page for additional detail: https://docs.5x5coop.com/docs/seniority-and-department-20 | 109,927,937 | 35% |
about_me | This is an open text field that the person has used to describe themselves as a business introduction. | 23,209,733 | 7% |
also_often_viewed | Other social profiles that are often viewed along with this profile. This is a JSON field containing the person and a link to their social profile. | 70,971,257 | 23% |
number_of_jobs | Number of jobs listed in online resumes. | 75,675,231 | 24% |
patents | This is a JSON field containing a list of all patents listed for this person. | 271,576 | <1% |
time_in_role | The time in their current listed position, in months. | 50,014,151 | 16% |
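
Several fields in the table need decoding before use: the Unix timestamp fields (for example job_title_last_updated), the JSON fields (for example work_history), and the comma-separated arrays (for example additional_personal_emails and company_naics). The sketch below shows one way to handle each with the Python standard library. It assumes timestamps are whole seconds since the epoch, that JSON fields hold an array of objects, and that blank strings mean null; the helper names, the JSON key names, and the sample row are hypothetical and not taken from the dataset.

```python
import json
from datetime import datetime, timezone


def parse_unix(value):
    """Convert a Unix-seconds string to an aware UTC datetime, or None if blank."""
    if not value or not value.strip():
        return None
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


def parse_json_field(value):
    """Decode a JSON-formatted field; blank values become an empty list."""
    if not value or not value.strip():
        return []
    return json.loads(value)


def parse_comma_list(value):
    """Split a comma-separated field into trimmed, non-empty items."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


# Hypothetical row; every value here is made up for illustration.
row = {
    "job_title_last_updated": "1700000000",
    "work_history": '[{"company_name": "Example Co", "position": "Analyst"}]',
    "additional_personal_emails": "a@example.com, b@example.com",
    "company_naics": "",
}

updated = parse_unix(row["job_title_last_updated"])
history = parse_json_field(row["work_history"])
emails = parse_comma_list(row["additional_personal_emails"])
naics = parse_comma_list(row["company_naics"])
```

Returning None or an empty list for blank values keeps "unknown" distinct from real data, which matters given how many fields have fill rates below 50%.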