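One large table is in Oracle and another is in MySQL, and the two tables need to be joined. Pulling everything back into Spark puts too much load on the databases. Is moonbox unsuitable for this scenario? #80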

Open
mixhuhu opened this issue Aug 14, 2019 · 1 comment

Comments

@mixhuhu

mixhuhu commented Aug 14, 2019

No description provided.

@fumiwork

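First, if the query contains operators other than the join that can be pushed down, moonbox supports operator pushdown. For example, if you aggregate or filter before joining, moonbox does not pull the full tables back into Spark.

Second, if the requirement really is a direct join of two large tables, then pulling all the data back into Spark to do the join is unavoidable. If the concern is load on the databases (mainly I/O), moonbox can limit the parallelism with which data is pulled from the database into Spark.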
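To make the two ideas concrete, here is a minimal sketch using the option names of Spark's generic JDBC data source, not moonbox's own configuration (which this thread does not show). The helper name `jdbc_read_options`, the table name, and the URL are hypothetical. It wraps a filter in a subquery so the database evaluates it, and it caps the number of read partitions, which in plain Spark also bounds the number of concurrent JDBC connections.

```python
def jdbc_read_options(url, table, predicate=None, partition_column=None,
                      lower_bound=None, upper_bound=None,
                      max_parallelism=4, fetch_size=1000):
    """Build an options dict for Spark's JDBC data source (hypothetical helper).

    predicate        -- SQL filter evaluated inside the database (pushdown)
    max_parallelism  -- upper limit on read partitions / JDBC connections
    """
    if predicate:
        # Wrap the filter in an aliased subquery so only matching rows
        # leave the database. No "AS" keyword, since Oracle rejects it
        # for table aliases.
        dbtable = f"(SELECT * FROM {table} WHERE {predicate}) t"
    else:
        dbtable = table

    options = {
        "url": url,
        "dbtable": dbtable,
        "fetchsize": str(fetch_size),
        "numPartitions": str(max_parallelism),
    }

    if partition_column is not None:
        # Spark requires the bounds whenever partitionColumn is set.
        if lower_bound is None or upper_bound is None:
            raise ValueError("lower_bound and upper_bound are required "
                             "when partition_column is given")
        options["partitionColumn"] = partition_column
        options["lowerBound"] = str(lower_bound)
        options["upperBound"] = str(upper_bound)

    return options


# Hypothetical example: filter in MySQL first, read with at most 2 connections.
opts = jdbc_read_options(
    "jdbc:mysql://host/db", "orders",
    predicate="dt >= '2019-08-01'",
    partition_column="id", lower_bound=1, upper_bound=1000,
    max_parallelism=2,
)
```

With PySpark available, such a dict could be passed as `spark.read.format("jdbc").options(**opts).load()`. For a direct join of two full tables, the filter would be omitted and only the parallelism cap would help with database load.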
