Skip to content

Latest commit

 

History

History
194 lines (150 loc) · 5.48 KB

File metadata and controls

194 lines (150 loc) · 5.48 KB
comments difficulty edit_url tags
true
中等
数据库

English Version

题目描述

表:cities

+-------------+---------+
| Column Name | Type    | 
+-------------+---------+
| state       | varchar |
| city        | varchar |
+-------------+---------+
(state, city) 是这张表中值互不相同的列的组合。
这张表的每一行包含州名和其中的城市名。

编写一个解决方案来找到 每个州 中的 所有城市 并且根据下列条件分析它们:

  • 逗号分隔 字符串组合每一个州的所有城市。
  • 只显示有 至少 3 个城市的州。
  • 只显示 至少有一个城市 以与 州名相同字母开头 的州。

返回结果表以字母匹配城市的数量 降序 排序,然后按州名称 升序 排序的结果表。

结果格式如下所示。

 

示例:

输入:

cities 表:

+--------------+---------------+
| state        | city          |
+--------------+---------------+
| New York     | New York City |
| New York     | Newark        |
| New York     | Buffalo       |
| New York     | Rochester     |
| California   | San Francisco |
| California   | Sacramento    |
| California   | San Diego     |
| California   | Los Angeles   |
| Texas        | Tyler         |
| Texas        | Temple        |
| Texas        | Taylor        |
| Texas        | Dallas        |
| Pennsylvania | Philadelphia  |
| Pennsylvania | Pittsburgh    |
| Pennsylvania | Pottstown     |
+--------------+---------------+

输出:

+-------------+-------------------------------------------+-----------------------+
| state       | cities                                    | matching_letter_count |
+-------------+-------------------------------------------+-----------------------+
| Pennsylvania| Philadelphia, Pittsburgh, Pottstown       | 3                     |
| Texas       | Dallas, Taylor, Temple, Tyler             | 3                     |
| New York    | Buffalo, Newark, New York City, Rochester | 2                     |
+-------------+-------------------------------------------+-----------------------+

解释:

  • Pennsylvania:
    <ul>
    	<li>有 3 个城市(符合最低条件)</li>
    	<li>所有的 3 个城市都以 'P' 开头(与州相同)</li>
    	<li>matching_letter_count = 3</li>
    </ul>
    </li>
    <li><strong>Texas</strong>:
    <ul>
    	<li>有 4 个城市(符合最低条件)</li>
    	<li>3 个城市 (Taylor, Temple, Tyler) 以 'T' 开头(与州相同)</li>
    	<li>matching_letter_count = 3</li>
    </ul>
    </li>
    <li><strong>New York</strong>:
    <ul>
    	<li>有 4 个城市(符合最低条件)</li>
    	<li>2 个城市 (Newark, New York City) 以 'N' 开头(与州相同)</li>
    	<li>matching_letter_count = 2</li>
    </ul>
    </li>
    <li><strong>California</strong> 没有包含在输出表,因为:
    <ul>
    	<li>尽管它有 4 个城市(符合最低条件)</li>
    	<li>没有城市以 'C' 开头(不符合字母匹配条件)</li>
    </ul>
    </li>
    

注意:

  • 结果以 matching_letter_count 降序排序。
  • 当 matching_letter_count 持平(Texas 和 New York 都为 2),按州名字母序排序。
  • 每一行的城市也以字母序排序。

解法

方法一:分组聚合 + 过滤

我们可以将 cities 表按照 state 字段进行分组聚合,然后对每个分组进行过滤,筛选出满足条件的分组。

MySQL

# Write your MySQL query statement below
SELECT
    state,
    GROUP_CONCAT(city ORDER BY city SEPARATOR ', ') AS cities,
    COUNT(
        CASE
            WHEN LEFT(city, 1) = LEFT(state, 1) THEN 1
        END
    ) AS matching_letter_count
FROM cities
GROUP BY 1
HAVING COUNT(city) >= 3 AND matching_letter_count > 0
ORDER BY 3 DESC, 1;

Pandas

import pandas as pd


def state_city_analysis(cities: pd.DataFrame) -> pd.DataFrame:
    cities["matching_letter"] = cities["city"].str[0] == cities["state"].str[0]

    result = (
        cities.groupby("state")
        .agg(
            cities=("city", lambda x: ", ".join(sorted(x))),
            matching_letter_count=("matching_letter", "sum"),
            city_count=("city", "count"),
        )
        .reset_index()
    )

    result = result[(result["city_count"] >= 3) & (result["matching_letter_count"] > 0)]

    result = result.sort_values(
        by=["matching_letter_count", "state"], ascending=[False, True]
    )

    result = result.drop(columns=["city_count"])

    return result