This repository has been archived by the owner on Mar 24, 2022. It is now read-only.

Pyspark tutorial? #18

Open
zhang2jg opened this issue Mar 22, 2017 · 4 comments

Comments

@zhang2jg

zhang2jg commented Mar 22, 2017

Hi, I have gone through the tutorial and would like to try PySpark on HDFS. I noticed that PySpark (2.0.x) is pre-installed, but it doesn't support the pre-installed Python (version 2.6.6). To make it work, I need to upgrade to a newer version of Python. I tried using pip and yum to update Python, but these commands are not recognized in the VM.
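For anyone hitting the same problem, a quick way to confirm the mismatch is to compare the interpreter's version against a minimum. The sketch below is only illustrative: the `(2, 7)` minimum is an assumption taken from the later discussion of Python 2.7 in this thread, and `meets_minimum` and `describe` are hypothetical helper names. It avoids newer syntax so it should also run on the 2.6.6 interpreter itself.

```python
import sys

# Assumed minimum for this sketch; the thread only says 2.6.6 is too old.
ASSUMED_MINIMUM = (2, 7)


def meets_minimum(version_info, minimum=ASSUMED_MINIMUM):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def describe(version_info):
    """Format the first three version components as 'major.minor.micro'."""
    return "%d.%d.%d" % tuple(version_info[:3])


if __name__ == "__main__":
    current = sys.version_info
    if meets_minimum(current):
        print("Python %s meets the assumed minimum" % describe(current))
    else:
        print("Python %s is older than the assumed minimum" % describe(current))
```

Running this with the interpreter that `pyspark` would pick up shows whether an upgrade or a side-by-side install is needed.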

@danielgustafsson
Contributor

@skahler-pivotal do you know if that's doable in the image?

@skahler-vmware
Contributor

Not sure if that is doable in the image. PySpark must be coming down as part of Zeppelin, but I imagine the fact that the system is still running RHEL6 is causing an issue.

I'll take this as another flag that the process needs to be upgraded to use RHEL7.

There are a couple of tutorials on upgrading to or adding Python 2.7 that aren't that tough, but yum and some of the system tooling rely on 2.6 at the base, so you have the potential to break those. It is a VM, though, so probably not much is lost if it does go sideways.
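One way to keep the system 2.6 untouched is to add Python 2.7 alongside it and point PySpark at the new interpreter through Spark's `PYSPARK_PYTHON` environment variable. The following is a minimal sketch under stated assumptions: the path `/usr/local/bin/python2.7` is hypothetical, the helper names are made up for illustration, and `pyspark` is assumed to be on `PATH`.

```python
import os
import subprocess

# Hypothetical location of a side-by-side Python 2.7 install; adjust to the real path.
ASSUMED_PYTHON27 = "/usr/local/bin/python2.7"


def pyspark_env(python_path=ASSUMED_PYTHON27, base_env=None):
    """Return a copy of the environment with PYSPARK_PYTHON set to python_path."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_PYTHON"] = python_path
    return env


def launch_pyspark(python_path=ASSUMED_PYTHON27):
    """Start the pyspark shell with the chosen interpreter.

    Assumes the `pyspark` launcher is on PATH; returns its exit code.
    """
    return subprocess.call(["pyspark"], env=pyspark_env(python_path))
```

Calling `launch_pyspark()` would start the shell with the alternate interpreter, while yum and other system tools keep using the default 2.6.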

@zhang2jg
Author

Thanks! I figured out that I can use yum when logged in as root. Previously I was logged in as gpadmin and couldn't use yum.

@skahler-vmware
Contributor

gpadmin should be able to sudo to root in the VM.


3 participants