I was inspired by the research bold. web design did for their Fortune 500 palettes site to compile a dataset of company logo, brand, and website colors.
This repo contains brand palettes for those companies, palettes extracted from their homepages, and each company’s logo and a screenshot of its homepage. Do with it what you may (I have some links at the bottom for possible analysis tools).
- logo&palette_scraper.py: hits the bold. site, loops over the various industries, and scrapes each company's name, industry, brand palette, and logo location
- download_logos.py: downloads the logos from the last bullet into logos/
- get_urls.py: takes the company names and does a quick Google search for their homepage URLs
- take_screenshots.py: pops open a headless Chrome browser and screenshots the URLs from above. Saves them to a hidden screenshots/ folder. Hidden because ~see next bullet~
- bulk_resize_images.py: resizes the screenshots to 512x512 images
- extract_screenshot_colors.py: takes said screenshots and uses the colorgram package to extract the top 6 colors in each screenshot

logo_colors.csv (sourced from the bold. site)
- company: company name, hyphen-separated
- category: industry
- color_{1-8}: contains 1-8 hex codes of brand colors (as determined by bold.)

screenshot_colors.csv (extracted from website screenshots using colorgram)
- company: company name, hyphen-separated
- color_{1-6}: contains 1-6 hex codes of colors in the screenshot
- color_{1-6}_proportion: proportion of the screenshot that contains said color
- screenshot_location: where the screenshot is saved

company_urls.csv
- company: company name, hyphen-separated
- url: the company’s homepage to be screenshotted

logo_locations.csv
- company: company name, hyphen-separated
- file_name: where the logo is saved
- url: the URL of the logo to be downloaded
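
As one possible starting point for analysis, here is a minimal sketch that picks the single most prominent color per company from data shaped like screenshot_colors.csv. It is not part of the repo's scripts. It assumes the columns are literally named color_1 through color_6 and color_1_proportion through color_6_proportion, and that unused color slots are left empty. The function name dominant_colors and the sample rows are made up for illustration.

```python
import csv
import io


def dominant_colors(csv_text, max_colors=6):
    """Map each company to the (hex_code, proportion) pair with the largest proportion."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        best = None
        for i in range(1, max_colors + 1):
            # Assumed column names: color_1, color_1_proportion, and so on.
            hex_code = (row.get(f"color_{i}") or "").strip()
            raw = (row.get(f"color_{i}_proportion") or "").strip()
            if not hex_code or not raw:
                continue  # this slot is empty or the column is missing
            proportion = float(raw)
            if best is None or proportion > best[1]:
                best = (hex_code, proportion)
        if best is not None:
            result[row["company"]] = best
    return result


# Made-up rows for illustration only, not taken from the dataset.
SAMPLE = """company,color_1,color_1_proportion,color_2,color_2_proportion,screenshot_location
example-co,#ffffff,0.7,#003366,0.3,screenshots/example-co.png
other-co,#000000,0.4,#ff0000,0.6,screenshots/other-co.png
"""

if __name__ == "__main__":
    for company, (hex_code, proportion) in dominant_colors(SAMPLE).items():
        print(company, hex_code, proportion)
```

To run it against the real file, read screenshot_colors.csv into a string and pass that in place of SAMPLE. If the actual headers or hex formatting differ from the assumptions above, adjust the column names accordingly.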