CLOUDSTACK-10013: Migrate systemvmtemplate to Debian9 #2211
5f3cb74 to 5844597
5844597 to 70d84c9
|
I am trying to review this one, but obviously testing it will be difficult. Can you tell me a bit about what you are still stuck on? Building it with Veewee, but how to proceed from there? Just put it into a 4.10 cloud and test with it? Keep testing and testing? |
|
@wido I got the build system and patching via socket (KVM) working. Currently, I'm stuck on making the various init.d scripts work under systemd. I would say build the systemvmtemplate, deploy a fresh KVM-based environment, and fix cloud-early-config and other init.d scripts to work with systemd. The next step would be to do the same for other hypervisors. Feel free to push to this branch on the asf remote as separate commits (don't squash yet). For reference, see the checklist above. |
|
Ok! I will build a 4.10 cloud and try to build the systemVM. Currently stuck with Veewee though. Systemd should be doable, I have used it a lot and have a team of systemd people around me.
wido@wido-laptop:~/repos/cloudstack/tools/appliance$ veewee vbox build systemvmtemplate
/usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require': cannot load such file -- net/scp (LoadError)
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /var/lib/gems/2.3.0/gems/veewee-0.4.5.1/lib/veewee/provider/core/helper/ssh.rb:3:in `'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /var/lib/gems/2.3.0/gems/veewee-0.4.5.1/lib/veewee/provider/core/box.rb:2:in `'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /var/lib/gems/2.3.0/gems/veewee-0.4.5.1/lib/veewee/provider/virtualbox/box.rb:1:in `'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /var/lib/gems/2.3.0/gems/veewee-0.4.5.1/lib/veewee/provider/virtualbox/provider.rb:2:in `'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
from /var/lib/gems/2.3.0/gems/veewee-0.4.5.1/lib/veewee/providers.rb:14:in `[]'
from /var/lib/gems/2.3.0/gems/veewee-0.4.5.1/lib/veewee/environment.rb:225:in `get_box'
from /var/lib/gems/2.3.0/gems/veewee-0.4.5.1/lib/veewee/command/vbox.rb:22:in `build'
from /var/lib/gems/2.3.0/gems/thor-0.19.4/lib/thor/command.rb:27:in `run'
from /var/lib/gems/2.3.0/gems/thor-0.19.4/lib/thor/invocation.rb:126:in `invoke_command'
from /var/lib/gems/2.3.0/gems/thor-0.19.4/lib/thor.rb:369:in `dispatch'
from /var/lib/gems/2.3.0/gems/thor-0.19.4/lib/thor/invocation.rb:115:in `invoke'
from /var/lib/gems/2.3.0/gems/thor-0.19.4/lib/thor.rb:242:in `block in subcommand'
from /var/lib/gems/2.3.0/gems/thor-0.19.4/lib/thor/command.rb:27:in `run'
from /var/lib/gems/2.3.0/gems/thor-0.19.4/lib/thor/invocation.rb:126:in `invoke_command'
from /var/lib/gems/2.3.0/gems/thor-0.19.4/lib/thor.rb:369:in `dispatch'
from /var/lib/gems/2.3.0/gems/thor-0.19.4/lib/thor/base.rb:444:in `start'
from /var/lib/gems/2.3.0/gems/veewee-0.4.5.1/bin/veewee:24:in `'
from /usr/local/bin/veewee:22:in `load'
from /usr/local/bin/veewee:22:in `'
wido@wido-laptop:~/repos/cloudstack/tools/appliance$ |
|
@wido do a pull --rebase; I've pushed the latest branch rebased on master. cd to the folder, and do a |
70d84c9 to 6c8fbc9
|
@wido I think veewee only works with Ruby 1.9, so you need to downgrade your Ruby version. |
|
I eventually used the build.sh script in tools/appliance and got an img.raw. On my existing test cluster I manually overwrote the QCOW2 template on SS and then redeployed the Secondary Storage VM. It's 'running' now but the Agent isn't starting yet, which is a systemd thing. Looking into that. |
| chkconfig cloud-passwd-srvr off | ||
| chkconfig --add cloud | ||
| chkconfig cloud off | ||
| cat > /lib/systemd/system/cloud-early-config.service << EOF |
Shouldn't we put these in /etc/systemd/system as they are kind of custom?
We can do that, but aren't the files in /etc/systemd/system/multi-user.target.wants symlinked to /lib/systemd? I'm open to changes as long as the stuff works.
No, it isn't. But we'd have to look into that. I'll debug a bit further and push a few commits to the branch without squashing anything
| # The primary network interface | ||
| auto eth0 | ||
| iface eth0 inet dhcp | ||
| pre-up sleep 2 |
Why do we sleep here? Is that really needed?
Historic reasons, I've no idea. I simply moved the code from here: https://github.com/apache/cloudstack/blob/master/tools/appliance/definitions/systemvmtemplate/configure_networking.sh#L28
| mkdir -p /home/cloud/.ssh | ||
| chmod 700 /home/cloud/.ssh | ||
| echo "cloud:`openssl rand -base64 32`" | chpasswd | ||
| echo "root:password" | chpasswd |
Does this force the password for root to 'password'? It seems it does
Yes, this is to ensure all systemvmtemplates have these default credentials (this is what the current systemvmtemplates use as well). In a production env, you can set system.vm.random.password to true, which then uses system.vm.password to get the password randomly set when the mgmt server is initialized.
|
@wido yes, I got it so far as to get serial console ( |
Use a holder class to pass buffers, fixes potential leak. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Fixes timezone issue where dates show up as invalid in UI
- Introduces new event timeline listing/filtering of events
- Several UI improvements to add columns in list views
- Bulk operations support in instance list view to shut down and destroy multiple selected VMs (limitation: after operation, redundant entries may show up in the list view; refreshing the VM list view fixes that)
- Align table thead/tbody to avoid splitting of tables
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- Removes old/dead files
- Refactors file path/location, backward compatible to filepaths in systemvm.isoa
- Fixes failures around apache2
- Fixes strongswan/ipsec, l2tpd and pppd configs
- Uses auto=route in ipsec configs
- Fixes road-warrior setup
- Fixes site-to-site VPN with automatic connection configuration
- Fixes vpc_vpn tests
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This ports PR #1470 by @remibergsma.
Make the generated json files unique to prevent concurrency issues: the json files now have UUIDs to prevent them from getting overwritten before they've been executed. This prevents config from being pushed to the wrong router.
2016-02-25 18:32:23,797 DEBUG [c.c.a.t.Request] (AgentManager-Handler-1:null) (logid:) Seq 2-4684025087442026584: Processing: { Ans: , MgmtId: 90520732674657, via: 2, Ver: v1, Flags: 10, [{"com.cloud.agent.api.routing.GroupAnswer":{"results":["null - success: null","null - success: [INFO] update_config.py :: Processing incoming file => vm_dhcp_entry.json.4ea45061-2efb-4467-8eaa-db3d77fb0a7b\n[INFO] Processing JSON file vm_dhcp_entry.json.4ea45061-2efb-4467-8eaa-db3d77fb0a7b\n"],"result":true,"wait":0}}] }
On the router:
2016-02-25 18:32:23,416 merge.py __moveFile:298 Processed file written to /var/cache/cloud/processed/vm_dhcp_entry.json.4ea45061-2efb-4467-8eaa-db3d77fb0a7b.gz
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
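A minimal sketch of the naming scheme described in this commit message, assuming Python's `uuid.uuid4()` as the source of uniqueness. `unique_config_name` is a hypothetical helper for illustration only, not the function used in the PR:

```python
import uuid


def unique_config_name(base_name):
    """Return base_name with a random UUID suffix (hypothetical helper).

    Two configs queued for the same base file (e.g. vm_dhcp_entry.json)
    get distinct names, so a later one cannot overwrite an earlier one
    that has not been processed yet.
    """
    return "%s.%s" % (base_name, uuid.uuid4())


if __name__ == "__main__":
    # Prints something shaped like vm_dhcp_entry.json.<uuid>
    print(unique_config_name("vm_dhcp_entry.json"))
```

The resulting names have the same shape as the `vm_dhcp_entry.json.4ea45061-...` files visible in the log lines of the commit message.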
- Refactors and simplifies systemvm codebase file structures keeping the same resultant systemvm.iso packaging
- Password server systemd script and new postinit script that runs before sshd starts
- Fixes to keepalived and conntrackd config to make rVRs work again
- New /etc/issue featuring ascii based cloudmonkey logo/message and systemvmtemplate version
- SystemVM python codebase linted and tested. Added pylint/pep to Travis.
- iptables re-application fixes for non-VR systemvms.
- SystemVM template build fixes.
- Default secondary storage vm service offering boosted to have 2vCPUs and RAM equal to console proxy.
- Fixes to several marvin based smoke tests, especially rVR related tests. rVR tests to consider 3*advert_int+skew timeout before status is checked.
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
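The rVR test timeout mentioned in the last item can be written out as a small helper. This is only a sketch of the `3*advert_int+skew` arithmetic; the function name and the example values are assumptions, not taken from the marvin tests:

```python
def rvr_status_wait(advert_int, skew):
    """Seconds to wait before checking rVR status: 3 * advert_int + skew.

    advert_int: keepalived advertisement interval in seconds.
    skew: extra skew time in seconds (value supplied by the caller here).
    """
    return 3 * advert_int + skew
```

For example, with an assumed advertisement interval of 10 seconds and a skew of 2 seconds, a test would wait 32 seconds before checking router status.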
- Several systemvmtemplate optimizations
- Uses new macchinina template for running smoke tests
- Switch to latest Debian 9.3.0 release for systemvmtemplate
- Introduce a new `get_test_template` that uses tiny test template such as macchinina as defined test_data.py
- rVR related fixes and improvements
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
On XenServer, both redundant routers' VIFs were getting deleted when any PF rule was removed from any of the acquired public IPs. This fix ensures that lastIp is set to `false` when processed by hypervisor resources, to avoid removing VIFs when VPCs have any source NAT IP. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This adds the `net-tools` dependency on CentOS cloudstack-agent rpms. This will provide ifconfig, route and other tools that may be used by CloudStack scripts and utilities. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
Resize for VMware root disk should only be performed during VM start when vmware.create.full.clone is true i.e. the disk chain length is one. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes test failures around VMware with the new systemvmtemplate. In addition:
- Does not skip rVR related test cases for VMware
- Removes rc.local
- Processes unprocessed cmd_line.json
- Fixed NPEs around VMware tests/code
- On VMware, use udevadm to reconfigure nic/mac address rather than rebooting
- Fix proper acpi shutdown script for faster systemvm shutdowns
- Give at least 256MB of swap for VRs to avoid OOM on VMware
- Fixes smoke tests for environment related failures
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
In default/fresh installations, the guest OS type for systemvms with id=15 or Debian 5 (32-bit) can cause memory allocation issues in the guest. With Other Linux 64-bit as the guest OS, systemvms get all the allocated RAM. This avoids OOM related kernel panics for certain VRs such as rVRs, lbvm etc. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
- This migrates the current systemvmtemplate build system from veewee/virtualbox to packer and qemu based.
- This also introduces and updates a CentOS7 built-in template.
- Remove old appliance build scripts and files.
- Adds iftop package (CLOUDSTACK-9785)
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This moves the systemvmtemplate migration logic from the previous upgrade path to the 4.10.0.0->4.11.0.0 upgrade path. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
This fixes incorrect total host memory in listHosts and related host responses, regression introduced in #2120. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
In a VMware 55u3 environment it was found that CPVM and SSVM would get the same public IP. After another investigative review of the fetchNewPublicIp method, it was found that it would always pick up the first IP from the SQL query list/result. The cause was found to be that with the new changes no table/row locks are taken and the first item is used without looping through the list of available free IPs. The previous implementation, which put the IP address in allocating state, did not check that it was a free IP. In this refactoring/fix, the first free IP is first marked as Allocating, and if assign is requested that is changed into the Allocated state. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
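The selection logic described in this commit message can be sketched as follows. This is an in-memory illustration in Python, not the Java code from the PR: the `State` enum, the dict-based address rows, and the function name are all assumptions, and the database update (which can fail in the real code) is reduced to a plain assignment:

```python
from enum import Enum


class State(Enum):
    FREE = "Free"
    ALLOCATING = "Allocating"
    ALLOCATED = "Allocated"


def fetch_new_public_ip(addresses, assign):
    """Pick the first address that is still Free and reserve it.

    `addresses` is a list of dicts with "ip" and "state" keys, standing in
    for the rows returned by the SQL query. Returns the chosen dict, or
    None if no free address is left.
    """
    for addr in addresses:
        if addr["state"] is not State.FREE:
            continue  # skip addresses that were already taken
        addr["state"] = State.ALLOCATING  # step 1: reserve the free IP
        if assign:
            addr["state"] = State.ALLOCATED  # step 2: only when assign is requested
        return addr
    return None
```

The point of the loop is that a second caller skips the address the first caller already reserved, instead of both taking the first row of the result.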
|
Packaging result: ✔centos6 ✔centos7 ✔debian. JID-1466 |
This includes test related fixes and code review fixes based on reviews from @rafaelweingartner, @marcaurele, @wido and @DaanHoogland. This also includes VMware disk-resize limitation bug fix based on comments from @sateesh-chodapuneedi and @priyankparihar. This also includes the final changes to systemvmtemplate and fixes to code based on issues found via test failures. Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
| final IPAddressVO userIp = _ipAddressDao.findById(addr.getId()); | ||
| if (userIp.getState() == IpAddress.State.Free) { | ||
| addr.setState(IpAddress.State.Allocating); | ||
| if (_ipAddressDao.update(addr.getId(), addr)) { |
@rhtyd unlike the previous approach, this now seems to do an extra update operation on the DB when assign is true; can we avoid that?
The current regression fix is a two-step process where IPs are first marked Allocating and then marked Allocated if assign is true; this is because IPs in both Free and Allocating state can be allocated. Do you propose we simply mark free IPs as Allocated in the loop if assign is true? @yvsubhash if you can come up with further improvements/enhancements, please send a new PR, thanks.
|
Trillian test result (tid-1882)
✅ 🎉 all pass! |
|
@rhtyd ^^ 0 failures, haven't seen this for a while :) 🥇 |
|
Trillian test result (tid-1886)
✅ 🎉 all pass! |
|
Trillian test result (tid-1890)
When the agent keystore is setup for cpvm/ssvm it restarts the Re-ran tests and it passed: ✅ 🎉 all pass! |
|
Trillian test result (tid-1889)
✅ 🎉 all pass! |
|
Thanks @borisstoyanov :) I'll merge the PR now, and post-merge I'll kick off smoke and component tests. Some component tests have been fixed in #2344, and I will work with @borisstoyanov and others to further stabilize tests and master. |
|
Trillian test result (tid-1904)
|
|
Seems like we hit https://issues.apache.org/jira/browse/CLOUDSTACK-9749 again. |
|
@fmaximus thanks for sharing, can you explore a fix and perhaps help submit a PR on this, basing the fix in the Python script rather than the shell/setup scripts: |
Outstanding tasks (feel free to add more):
2GB2.4GB from 3.2GBcmdlineconsumption
Pinging for review and additional help with changes @wido @swill and others
This branch is pushed on ASF remote and allows for all committers to contribute changes directly, feel free to push fixes as separate commits, thanks.
Systemvmtemplate built from this branch are available here for testing: http://hydra.yadav.xyz/debian9/ (temporarily hosted until this branch is merged).
New SystemVM prompt showcasing the systemvmtemplate version and cloudmonkey ASCII art: