HR Dataset

Specifications

Release: November 14, 2024

Version: 0.1.0

Size: ~100 GB (.csv)

Update Freq: Monthly

Note: This file contains all people in the Universal Person contact file, with additional fields for HR-specific features, such as job history and time-in-role.
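At roughly 100 GB, the file is unlikely to fit in memory, so it is usually easier to stream it row by row. The following is a minimal sketch, not an official loader, that uses Python's standard csv module to tally how many rows carry a non-empty value in each column. Dividing that tally by the total row count gives a fill rate. It assumes the file has a header row and stores missing values as empty strings, neither of which is confirmed by the specifications above. The function name count_populated and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def count_populated(lines):
    """Stream CSV rows and count non-empty values per column.

    `lines` is any iterable of text lines (for example an open file),
    so the whole file never has to be held in memory.
    Assumption: missing values are stored as empty strings.
    """
    reader = csv.DictReader(lines)
    populated = Counter()
    total_rows = 0
    for row in reader:
        total_rows += 1
        for column, value in row.items():
            # Skip missing cells and any overflow cells from malformed rows.
            if isinstance(value, str) and value.strip():
                populated[column] += 1
    return total_rows, populated


# Tiny in-memory stand-in for the real file (hypothetical sample values).
sample = io.StringIO(
    "up_id,first_name,time_in_role\n"
    "1,Ada,14\n"
    "2,Grace,\n"
    "3,Alan,7\n"
)
total_rows, populated = count_populated(sample)
for column in ("up_id", "first_name", "time_in_role"):
    rate = populated[column] / total_rows
    print(f"{column}: {populated[column]} of {total_rows} ({rate:.0%})")
```

For the real file, the same function can be given an open file handle, for example one created with open(path, newline="", encoding="utf-8"), where path and the UTF-8 encoding are assumptions about the delivered file.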

Data Schema, Counts, and Fill Rates

| Feature | Description | Count | Fill Rate |
| --- | --- | --- | --- |
| up_id | Universal Person ID. A unique identifier assigned to a person, linking all business and consumer information to a single individual. The up_id and the company_id are primary keys throughout the co-op data files. | 313,252,124 | 100% |
| first_name | First name of the person in this record. | 313,206,871 | 100% |
| last_name | Last name of the person in this record. | 313,207,338 | 100% |
| business_email | The primary business email associated with this person. | 101,268,449 | 32% |
| programmatic_business_emails | An array of all common business email formats, built from the first_name, last_name, and company_domain fields in the B2B and UP files. | 154,651,483 | 49% |
| mobile_phone | Mobile number uniquely associated with this person, regardless of business or personal association. | 163,275,247 | 52% |
| direct_number | A non-mobile business or personal number for the contact. | 132,604,671 | 42% |
| personal_phone | A non-mobile business or personal number for the contact. | 132,604,671 | 42% |
| linkedin_url | The LinkedIn URL for this person. | 98,090,403 | 31% |
| personal_address | Where this person lives. | 217,145,759 | 69% |
| personal_address_2 | Where this person lives. | 34,217,177 | 11% |
| personal_city | The city in which this person resides. | 275,214,179 | 88% |
| personal_state | The upper-case standard abbreviation for the person's home state. This field contains the 50 US states and Washington, D.C. See professional_state and company_state for other state data. | 276,885,040 | 88% |
| personal_zip | The 5-digit US ZIP code for the person's home address. | 225,539,454 | 72% |
| personal_zip4 | The 4-digit extension of the person's home ZIP code. | 209,930,049 | 67% |
| personal_emails | The primary personal email observed for this person record. | 208,735,026 | 67% |
| additional_personal_emails | Any additional personal emails associated with this person, excluding any emails previously reported as invalid. Stored as a comma-separated array. | 35,932,562 | 11% |
| gender | Gender of the person (M/F/U). | 259,713,360 | 83% |
| age_range | The age range this contact fits within. The ranges are industry-standard demographic ordinal values (defined in a separately linked range table). Null values indicate unknown. | 191,771,484 | 61% |
| married | If the person is married, the value is 'Y'. Blank values represent unknown. | 139,368,516 | 44% |
| children | Indicates whether the person associated with this record has children. Values are 'Y', 'N', or NULL. Null values indicate unknown. | 99,556,499 | 32% |
| income_range | The income of this person, mapped to industry-standard demographic ranges (defined in a separately linked mapping table). Null values indicate unknown. | 114,696,611 | 37% |
| net_worth | Estimate of a household's total financial assets minus liabilities, reported within industry-standard demographic ranges (defined in a separately linked range table). Null values indicate unknown. | 137,723,940 | 44% |
| homeowner | Indicates whether the person in this record is a homeowner. Y and N are observed values; P means the person is likely a homeowner, based on probabilistic modeling; null values represent unknown. | 144,659,797 | 46% |
| job_title | The current job title of the person in this record. | 152,444,438 | 49% |
| seniority_level | Seniority derived from the job title using the Seniority Clustering / Department mapping model. | 148,901,846 | 48% |
| department | Department of the job_title in this person's record. | 122,837,441 | 39% |
| professional_address | Where this person shows up to work. It could be the same as the personal or company address, or different, depending on company structure. | 47,611,966 | 15% |
| professional_address2 | Where this person shows up to work. It could be the same as the personal or company address, or different, depending on company structure. | 14,993,497 | 5% |
| professional_city | Where this person shows up to work. It could be the same as the personal or company city, or different, depending on company structure. | 52,661,517 | 17% |
| professional_state | The state in the professional address, that is, the state where this person works from. This field contains all 50 US states, Washington, D.C., and the five populated US territories: American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands. See company_state and personal_state for other state data. | 52,770,300 | 17% |
| professional_zip | The 5-digit ZIP code occurring within the professional address. | 47,517,475 | 15% |
| professional_zip4 | The 4-digit extension of the professional address ZIP code. | 30,546,110 | 10% |
| company_name | Company name associated with the person's job title. | 160,094,311 | 51% |
| company_domain | The primary domain associated with this company. | 154,729,448 | 49% |
| company_phone | General phone number for the company. | 147,008,663 | 47% |
| company_sic | The 4-digit Standard Industrial Classification (SIC) code(s) associated with the company (the SIC taxonomy is linked separately). | 144,374,743 | 46% |
| company_address | The physical address of the company headquarters. | 139,032,636 | 44% |
| company_city | The city of the company headquarters. | 160,047,523 | 51% |
| company_state | State code of the company headquarters. This field includes the 50 US states, Washington, D.C., the five US territories (American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands), and a variety of non-US user-reported data. See professional_state and personal_state for additional state data. | 155,924,372 | 50% |
| company_zip | ZIP code of the company headquarters. | 144,087,523 | 46% |
| company_linkedin_url | The URL for the company's LinkedIn profile. | 142,546,464 | 46% |
| company_revenue | Revenue range associated with the company, derived from the employee count according to a separately linked mapping. | 162,971,254 | 52% |
| company_employee_count | The number of observed US enterprise employees of this company, fit within standard firmographic ranges (listed in a separately linked table). | 162,971,254 | 52% |
| primary_industry | The primary industry for a company, as observed in multiple firmographic data sets. | 121,003,775 | 39% |
| business_email_validation_status | The validation status of the associated business email. Allowed values: 'valid' = the host email server does not return an 'invalid' response to an email sent to this address; 'Valid-ESP' = the validation flag was provided by an Email Service Provider (ESP); 'Valid-Digital' = the validation flag originated from a cookie signal or digital tag. | 76,491,370 | 24% |
| personal_emails_validation_status | The validation status of the associated personal email. 'Valid' indicates the email has been validated; a null value represents unknown. | 164,621,999 | 53% |
| personal_emails_last_seen | The Unix timestamp at which this personal email was last observed, via validation or other verifiable data. Null values indicate that no data is available. | 164,621,999 | 53% |
| company_last_updated | The Unix timestamp of the last update to the company_name field in the B2B and Universal Person files. It records when the contact record was last confirmed at the company. Note: a company may have multiple company_last_updated values, reflecting the last updates across different employees. | 74,520,207 | 24% |
| job_title_last_updated | The Unix timestamp of the last update to the job title. | 66,825,365 | 21% |
| last_updated | The Unix timestamp of the most recent update to any value in the record. | 252,864,619 | 81% |
| work_history | A JSON-formatted work history. When available, it includes: company name, position, duration, start_time, end_time, job_description, location, and social_url. | 76,682,020 | 24% |
| education_history | A JSON-formatted education history. When available, it includes: degree, description, start_time, end_time, institution_name, and social_url. | 43,815,159 | 14% |
| cc_id | A persistent, unique ID assigned to each company in the Co-Op Firmographic dataset. The cc_id and the up_id are the key linkage across the co-op data sets. | 154,729,448 | 49% |
| company_description | A varchar(2000) field with the company description from online profiles. | 128,137,470 | 41% |
| related_domains | Sub-domains and top-level domains that roll up to or redirect to the company_domain field in this record. | 31,937,910 | 10% |
| social_connections | The number of social media connections observed for this person, given in ranges of 1-9, 10-49, 50-99, 100-249, 250-499, and 500+. | 62,559,983 | 20% |
| dpv_code | The Delivery Point Validation (DPV) confirmation code is the primary method used by the USPS to indicate whether an address was considered deliverable or undeliverable. Y = confirmed for primary and secondary numbers; D = confirmed for primary number only, secondary number missing; S = confirmed for primary number only, secondary number present but unconfirmed; N = both primary and secondary numbers failed to confirm; blank = no data. | 149,038,903 | 48% |
| company_naics | North American Industry Classification System (NAICS) codes denoting the primary industry or industries the company is listed in. This field is a comma-separated array with up to 5 NAICS codes. | 146,010,160 | 47% |
| company_country | The ISO Alpha-2 country code of the company address. | 94,407,705 | 30% |
| contact_country | The ISO Alpha-2 country code of the personal address. | 75,995,885 | 24% |
| job_title_normalized | The cleaned-up version of the job_title, incorporating spelling corrections and standardized title elements, and normalized to reduce the number of variations. For example, "CEO", "C.E.O.", and "Chief Exec Ofc" all become "Chief Executive Officer". | 23,892,991 | 8% |
| seniority_level_2 | The updated model for seniority levels, modeled off the job_title and primary_industry. For additional detail, see https://docs.5x5coop.com/docs/seniority-and-department-20 | 138,025,191 | 44% |
| department_2 | The updated model for departments, modeled off the job_title and primary_industry. For additional detail, see https://docs.5x5coop.com/docs/seniority-and-department-20 | 109,927,937 | 35% |
| about_me | An open text field that the person has used to describe themselves as a business introduction. | 23,209,733 | 7% |
| also_often_viewed | Other social profiles that are viewed with this profile. A JSON field containing the person and a link to their social profile. | 70,971,257 | 23% |
| number_of_jobs | Number of jobs listed in online resumes. | 75,675,231 | 24% |
| patents | A JSON field containing a list of all patents listed for this person. | 271,576 | 0% |
| time_in_role | The time in the person's current listed position, in months. | 50,014,151 | 16% |
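
Several fields in this schema hold encoded values: Unix timestamps (for example last_updated), JSON (work_history, education_history), and comma-separated arrays (company_naics, additional_personal_emails). The sketch below shows one possible way to decode them with Python's standard library. It assumes timestamps are expressed in seconds, that work_history is a JSON list of objects keyed by the names in its description, and that array items contain no embedded commas. The schema does not state these details, so treat them as assumptions. The helper names and the sample row are hypothetical.

```python
import json
from datetime import datetime, timezone


def parse_unix(value):
    """Convert a Unix timestamp string to an aware UTC datetime, or None if blank.

    Assumption: the timestamp is expressed in seconds, not milliseconds.
    """
    if not value:
        return None
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


def parse_array(value):
    """Split a comma-separated array field into a list of trimmed items."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def parse_json(value):
    """Decode a JSON field, returning None when the field is blank."""
    return json.loads(value) if value else None


# Hypothetical row; the values are illustrative, not taken from the dataset.
row = {
    "last_updated": "1731542400",
    "company_naics": "541511, 541512",
    "work_history": '[{"position": "Analyst", "start_time": "2021-03"}]',
    "time_in_role": "18",
}

decoded = {
    "last_updated": parse_unix(row["last_updated"]),
    "company_naics": parse_array(row["company_naics"]),
    "work_history": parse_json(row["work_history"]),
    # time_in_role is documented as a number of months.
    "time_in_role_months": int(row["time_in_role"]) if row["time_in_role"] else None,
}
print(decoded["last_updated"].date())
print(decoded["company_naics"])
print(decoded["work_history"][0]["position"])
```

Returning None or an empty list for blank inputs keeps "unknown" distinct from a real value, which matters because most of these fields have fill rates well below 100%.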