tips_and_troubleshooting
To update rubygems on a cluster node, strip the bogus '000000000Z' timestamps from installed gemspecs, and silence rubygems deprecation warnings:

knife cluster ssh bonobo-worker-2 'sudo gem update --system'
knife cluster ssh bonobo-worker-2 'sudo true ; for foo in /usr/lib/ruby/gems/1.9.2-p290/specifications/* ; do sudo sed -i.bak "s!000000000Z!!" $foo ; done'
knife cluster ssh bonobo-worker-2 'sudo true ; for foo in /usr/lib/ruby/site_ruby/*/rubygems/deprecate.rb ; do sudo sed -i.bak "s!@skip ||= false!true!" $foo ; done'
To set delete_on_termination to 'true' after the fact, run the following (modify the instance and volume to suit):
ec2-modify-instance-attribute -v i-0704be6c --block-device-mapping /dev/sda1=vol-XX8d2c80::true
If you set disable_api_termination to true, then in order to terminate the node run:
ec2-modify-instance-attribute -v i-0704be6c --disable-api-termination false
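To check whether termination protection is currently set, the companion describe command from the same EC2 API tools should work (flag spelling is from memory -- verify against ec2-describe-instance-attribute --help):

ec2-describe-instance-attribute i-0704be6c --disable-api-termination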
To view whether an attached volume is deleted when the machine is terminated:
# show volumes that will be deleted
ec2-describe-volumes --filter "attachment.delete-on-termination=true"
You can't (as far as I know) alter the delete-on-termination flag of a running volume. Crazy, huh?
To view the user-data the instance was launched with, from on the machine itself:

curl http://169.254.169.254/latest/user-data
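The same metadata service answers for other instance facts too; for example:

curl http://169.254.169.254/latest/meta-data/instance-id
curl http://169.254.169.254/latest/meta-data/public-hostname
curl http://169.254.169.254/latest/meta-data/     # lists all available keys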
- Make one volume and format for XFS:
$ sudo mkfs.xfs -f /dev/sdh1
- options "defaults,nouuid,noatime" give good results (see the mount sketch just after this list). The 'nouuid' part prevents errors when mounting multiple volumes from the same snapshot.
- poke a file onto the drive:
datename=`date +%Y%m%d`
sudo bash -c "(echo $datename ; df /data/ebs1 ) > /data/ebs1/xfs-created-at-$datename.txt"
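For reference, a mount along the lines those options describe (device and mount point are the illustrative ones used above):

sudo mkdir -p /data/ebs1
sudo mount -o defaults,nouuid,noatime /dev/sdh1 /data/ebs1
# or persistently, via a line like this in /etc/fstab:
# /dev/sdh1   /data/ebs1   xfs   defaults,nouuid,noatime   0   0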
If you want to grow the drive:
- take a snapshot.
- make a new volume from it, at the new (larger) size
- mount that, and run sudo xfs_growfs against the mount point. You should have the volume mounted, and should stop anything that would be working the volume hard.
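A minimal sketch of that last step, assuming the new volume appears as /dev/sdi1 and gets mounted at /data/ebs1 (both names are illustrative):

sudo mkdir -p /data/ebs1
sudo mount -o defaults,nouuid,noatime /dev/sdi1 /data/ebs1
sudo xfs_growfs /data/ebs1    # grow the filesystem to fill the larger volume
df -h /data/ebs1              # confirm the new size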
To back up the hadoop namenode metadata directories, first lay out a dated backup tree:

bkupdir=/ebs2/hadoop-nn-backup/`date +"%Y%m%d"`
for srcdir in /ebs*/hadoop/hdfs/ /home/hadoop/gibbon/hdfs/ ; do destdir=$bkupdir/$srcdir ; echo $destdir ; sudo mkdir -p $destdir ; done
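The loop above only creates the destination directories; the actual copy would be something like the following (rsync and its flags are an assumption, not part of the original snippet):

for srcdir in /ebs*/hadoop/hdfs/ /home/hadoop/gibbon/hdfs/ ; do
  destdir=$bkupdir/$srcdir
  sudo mkdir -p $destdir
  sudo rsync -a "$srcdir/" "$destdir/"    # -a preserves ownership and permissions; adjust to taste
done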
Say you set up an NFS server 'core-homebase-0' (in the 'core' cluster) to host and serve out the /home directory, and a machine 'awesome-webserver-0' (in the 'awesome' cluster) that is an NFS client.
In each case, when the machine was born EC2 created a /home/ubuntu/.ssh/authorized_keys
file listing only the single approved machine keypair -- 'core' for the core cluster, 'awesome' for the awesome cluster.
When chef client runs, however, it mounts the NFS share at /home. This then masks the actual /home directory -- nothing that's on the base directory tree shows up. Which means that after chef runs, the /home/ubuntu/.ssh/authorized_keys file on awesome-webserver-0 is the one for the 'core' cluster, not the 'awesome' cluster.
The solution is to use the cookbook ironfan provides -- it moves the 'ubuntu' user's home directory to an alternative path not masked by the NFS.
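To check which cluster's keys a client actually ends up with after the mount (a quick diagnostic, using the example hostnames above), on awesome-webserver-0:

mount | grep ' /home '                    # is /home really an NFS mount?
cat /home/ubuntu/.ssh/authorized_keys     # whose keypair is listed -- 'core' or 'awesome'?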
For problems starting NFS server on ubuntu maverick systems, read, understand and then run /tmp/fix_nfs_on_maverick_amis.sh -- See "this thread for more":http://fossplanet.com/f10/[ec2ubuntu]-not-starting-nfs-kernel-daemon-no-support-current-kernel-90948/
Suppose you are using the @git@ resource to deploy a recipe (@george@ for sake of example). If @/var/chef/cache/revision_deploys/var/www/george@ exists then nothing will get deployed, even if /var/www/george/{release_sha} is empty or screwy. If git deploy is acting up in any way, nuke that cache from orbit -- it's the only way to be sure.
$ sudo rm -rf /var/www/george/{release_sha} /var/chef/cache/revision_deploys/var/www/george
Your service is probably installed but removed from runit's purview; check the /etc/service
symlink. All of the following should be true:
- the directory /etc/sv/foo exists, containing the file run and the directories log and supervise
- /etc/init.d/foo is symlinked to /usr/bin/sv
- /etc/service/foo is symlinked to /etc/sv/foo