-
Notifications
You must be signed in to change notification settings - Fork 3
Useful SQL Queries
Peter Mangiafico edited this page Mar 4, 2019
·
4 revisions
Dump identifiers for approved publications to a CSV file:
DOIs:
mysql -e "select identifier_value from publication_identifiers pi inner join contributions c on pi.publication_id = c.publication_id where pi.identifier_type='doi' and pi.certainty='confirmed' and c.status='approved'" -u pub -p sul-pub > 'stanford-dois.csv'
Total approved publications:
select count(distinct(publication_id)) from contributions where status='approved';
Recent contributions for publications with a WoS UID and a PubMed ID:
select * from contributions, publications
where publications.pmid is not null and
publications.wos_uid is not null and
contributions.publication_id = publications.id
order by publications.created_at desc, contributions.author_id asc
limit 30
Recent contributions for publications with a WosUID and no PubMed ID:
select * from contributions, publications
where publications.pmid is null and
publications.wos_uid is not null and
contributions.publication_id = publications.id
order by publications.created_at desc, contributions.author_id asc
limit 30
Find authors with a lot of institution data by looking for how many start_date
values they have:
SELECT *, count(*) as count
FROM author_identities
WHERE start_date is not NULL
GROUP BY author_id
ORDER BY count desc
To find some common last names:
SELECT preferred_last_name, count(preferred_last_name) as 'num'
FROM authors
WHERE preferred_last_name is not null
GROUP BY preferred_last_name
ORDER BY count(preferred_last_name) desc
LIMIT 100
To find authors with a common last name who have many approved contributions:
SELECT a.preferred_first_name, a.preferred_middle_name, a.preferred_last_name, count(c.id) as 'seed_count'
FROM authors as a LEFT JOIN contributions as c
ON a.id=c.author_id
WHERE a.preferred_last_name = 'Wang'
AND c.status = 'approved'
GROUP BY a.id
ORDER BY count(c.id) desc
LIMIT 100