Stable/mitaka#8
Open
ykmmxm wants to merge 8664 commits into turbonomic:stable/mitaka from ykmmxm:stable/mitaka
Conversation
Starting with the Pike release, reporting VCPU/memory/disk is no longer required. However, we used VCPU to check if a node is available, so nodes without VCPU in their properties were always ignored. This patch changes the logic to use the existing _node_resources_unavailable call. This change also fixes another related issue: when disk or memory are missing from properties, the virt driver tries to report zero max_unit for them, which is not allowed by placement. Change-Id: I1bbfc152189252c5c45e6153695a802d17b76690 Closes-Bug: #1723423 (cherry picked from commit b25928d)
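A minimal sketch of the revised check, in the spirit of the fix: the callable passed in stands for the ironic driver's existing _node_resources_unavailable() logic, and the function name here is illustrative, not nova's actual method.

```python
def node_is_available(node, resources_unavailable):
    """Illustrative only: availability no longer depends on reported VCPUs.

    'resources_unavailable' stands in for the driver's existing
    _node_resources_unavailable() check (maintenance mode, bad provision
    state, etc.); the real signature in nova differs.
    """
    # Old logic was roughly: `if not node.properties.get('cpus'): return False`,
    # which skipped nodes that legitimately omit resource properties since Pike.
    return not resources_unavailable(node)
```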
…st allocs" into stable/pike
…hitelist" into stable/pike
…elist" into stable/pike
An OSError leads the instance to the ERROR state; changing it to MigrationPreCheckError leaves the instance status unchanged. Also modifies some test cases to make unit testing easier. Closes-Bug: 1694636 Change-Id: I3286c32ca205ffd2d5d1aaab88cc96699476e410 (cherry picked from commit cb565d9)
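A minimal sketch of the pattern, assuming a hypothetical pre-check helper; only exception.MigrationPreCheckError is taken from nova, the rest is illustrative.

```python
import os

from nova import exception


def _ensure_dest_path_exists(path):
    """Hypothetical pre-check helper; the real nova change wraps a different call."""
    try:
        os.stat(path)  # may raise OSError, e.g. permission denied
    except OSError as e:
        # Surface the failure as MigrationPreCheckError so the instance
        # stays in its current state instead of flipping to ERROR.
        raise exception.MigrationPreCheckError(reason=str(e))
```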
The BDM has no uuid attribute so the debug message in here would result in an AttributeError. This has been around since the creation of this object, and the debug log message was probably copied from the Instance object. This was only exposed in Pike when this code started lazy-loading the instance field: I1dc54a38f02bb48921bcbc4c2fdcc2c946e783c1 So this change fixes that bug and adds tests for obj_load_attr. Change-Id: I8b55227b1530a76c2f396c035384abd89237d936 Closes-Bug: #1726871 (cherry picked from commit 1ca191f)
Replace the ocata config-reference URLs with URLs in each project repo. Change-Id: I48d7c77a6e0eaaf0efe66f848f45ae99007577e1 Closes-Bug: #1715545 (cherry picked from commit 2fce8a1)
As part of the docs migration from openstack-manuals to nova in the pike release we missed the config-drive docs. This change does the following: 1. Imports the config-drive doc into the user guide. 2. Fixes a broken link to the metadata service in the doc. 3. Removes a note about liberty being the current release. 4. Adds a link in the API reference parameters to actually point at the document we have in tree now, which is otherwise not very discoverable as the main index does not link to this page (or the user index for that matter). Partial-Bug: #1714017 Closes-Bug: #1720873 Change-Id: I1d54e1f5a1a94e9821efad99b7fa430bd8fece0a (cherry picked from commit 59bd2f6)
This imports the "provide-user-data-to-instances" page from the old openstack-manuals user guide. Since we don't have a glossary, the :term: link is removed and replaced with just giving the glossary definition as the first part of the doc. Change-Id: Iae70d9b53d6cefb3bcb107fe68499cccb71fc15e Partial-Bug: #1714017 (cherry picked from commit 3fc8538)
One of the things this commit:
commit 14c38ac
Author: Kashyap Chamarthy <kchamart@redhat.com>
Date: Thu Jul 20 19:01:23 2017 +0200
libvirt: Post-migration, set cache value for Cinder volume(s)
[...]
did was to remove supposedly "duplicate" calls to _set_cache_mode().
That came back to bite us.
While Cinder volumes now have their cache value handled correctly during
migration, the above-referenced commit (14c38ac) introduced a regression:
it disregards the 'disk_cachemodes' Nova config parameter altogether for
boot disks. Even if a user sets the cache mode to 'writeback', it is
ignored and 'none' is set unconditionally.
Add the _set_cache_mode() calls back in _get_guest_storage_config().
Co-Authored-By: melanie witt <melwittt@gmail.com>
Closes-Bug: #1727558
Change-Id: I7370cc2942a6c8c51ab5355b50a9e5666cca042e
(cherry picked from commit 24e79bc)
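Roughly, the restored call sites look like the following simplified sketch; the real _get_guest_storage_config() signature and disk iteration in the libvirt driver differ.

```python
def _get_guest_storage_config(self, instance, disk_mapping):
    """Simplified sketch; the actual signature and helpers differ."""
    devices = []
    for disk_info in disk_mapping:  # illustrative iteration
        conf = self._get_guest_disk_config(instance, disk_info)
        # Restored call: honour the 'disk_cachemodes' config option for
        # boot/ephemeral disks instead of leaving libvirt's default.
        self._set_cache_mode(conf)
        devices.append(conf)
    return devices
```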
In I6ddcaaca37fc5387c2d2e9f51c67ea9e85acb5c5 we forgot to update the legacy filter properties dictionary so the requested target wasn't passed to the scheduler when evacuating. Adding a functional test for verifying the behaviour. NOTE(sbauza): The issue has been incidentally fixed in Pike by I434af8e4ad991ac114dd67d66797a562d16bafe2 so the regression test just verifies that the expected behaviour works. The Newton and Ocata backports will be slightly different from that one as we need to verify that host3 will be preferred eventually over host2. Related-Bug: #1702454 Change-Id: Id9adb10d2ef821c8b61d8f1d5dc9dd66ec7aaac8 (cherry picked from commit e0e2e065a495b4fa9ebdec987c935e3c83118c46)
When we added the requested_destination field for the RequestSpec object in Newton, we forgot to pass it to the legacy dictionary when wanting to use scheduler methods not yet supporting the NovaObject. As a consequence, when we were transforming the RequestSpec object into a tuple of (request_spec, filter_props) dicts and then rehydrating a new RequestSpec object using those dicts, the newly created object was not keeping that requested_destination field from the original. Change-Id: Iba0b88172e9a3bfd4f216dd364d70f7e01c60ee2 Closes-Bug: #1702454 (cherry picked from commit 69bef428bd555bb31f43db6ca9c21db8aeb9007e)
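A hedged sketch of the kind of change involved, assuming the fix lands in RequestSpec.to_legacy_filter_properties_dict(); the existing fields are elided and the exact attribute handling may differ.

```python
def to_legacy_filter_properties_dict(self):
    filter_properties = {}  # existing legacy fields elided for brevity
    # Carry requested_destination into the legacy dict so it survives the
    # (request_spec, filter_props) round trip back into a RequestSpec.
    if (self.obj_attr_is_set('requested_destination')
            and self.requested_destination):
        filter_properties['requested_destination'] = self.requested_destination
    return filter_properties
```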
…tance If we're calling build_request_spec in conductor.rebuild_instance, it's because we are evacuating and the instance is so old it does not have a request spec. We need the request_spec to pass to the scheduler to pick a destination host for the evacuation. For evacuate, nova-api does not pass any image reference parameters, and even if it did, those are image IDs, not an image meta dict that build_request_spec expects, so this code has just always been wrong. This change fixes the problem by passing a primitive version of the instance.image_meta which build_request_spec will then return back to conductor and that gets used to build a RequestSpec object from primitives. It's important to use the correct image meta so that the scheduler can properly filter hosts using things like the AggregateImagePropertiesIsolation and ImagePropertiesFilter filters. Change-Id: I0c8ce65016287de7be921c312493667a8c7f762e Closes-Bug: #1727855 (cherry picked from commit d2690d6)
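The conductor-side call then looks roughly like this sketch; 'context' and 'instance' come from the enclosing conductor method, and this is not the verbatim backport.

```python
from nova.objects import base as obj_base
from nova.scheduler import utils as scheduler_utils

# Hand build_request_spec a primitive dict of the instance's image metadata
# rather than a bare image id (or nothing at all), so the scheduler filters
# see real image properties.
image_meta = obj_base.obj_to_primitive(instance.image_meta)
request_spec = scheduler_utils.build_request_spec(context, image_meta,
                                                  [instance])
```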
The bandwidth param, previously set outside of the guest object's "migrate" method, has to be set inside that method to avoid duplicating the option. (cherry picked from commit c212ad2) Backported to avoid a minor merge conflict backporting change I9b545ca8, and because it addresses a related issue calling migrateToURI3. Change-Id: I8a37753dea8eca7b26466f17dfbdc184c48c24c5 Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@redhat.com>
If we specify block migration, but there are no disks which actually require block migration we call libvirt's migrateToURI3() with VIR_MIGRATE_NON_SHARED_INC in flags and an empty migrate_disks in params. Libvirt interprets this to be the default block migration behaviour of "block migrate all writeable disks". However, migrate_disks may only be empty because we filtered attached volumes out of it, in which case libvirt will block migrate attached volumes. This is a data corruptor. This change addresses the issue at the point we call migrateToURI3(). As we never want the default block migration behaviour, we can safely remove the flag if the list of disks to migrate is empty. (cherry picked from commit ea9bf52) nova/tests/unit/virt/libvirt/test_driver.py: Explicitly asserts byte string destination_xml in _test_live_migration_block_migration_flags. Not required in master due to change I85cd9a90. Change-Id: I9b545ca8aa6dd7b41ddea2d333190c9fbed19bc1 Resolves-bug: #1719362
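The guard can be pictured as the following sketch around the migrateToURI3() call; the helper name is illustrative, only the libvirt flag constant is real.

```python
import libvirt  # python-libvirt bindings


def _strip_block_migration_flag_if_no_disks(flags, migrate_disks):
    """Illustrative helper: never fall back to libvirt's default behaviour
    of block-migrating every writeable disk."""
    if not migrate_disks:
        # Nothing left to block-migrate (e.g. all disks were attached
        # volumes and were filtered out), so drop the incremental
        # block-migration flag before calling migrateToURI3().
        flags &= ~libvirt.VIR_MIGRATE_NON_SHARED_INC
    return flags
```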
When confirming a resize, the libvirt driver on the source host checks to see if the instance base directory (which contains the domain xml files, etc) exists and if the root disk image does not, it removes the instance base directory. However, the root disk image won't exist on local storage for a volume-backed instance and if the instance base directory is on shared storage, e.g. NFS or Ceph, between the source and destination host, the instance base directory is incorrectly deleted. This adds a check to see if the instance is volume-backed when deciding whether the instance base directory should be removed from the source host when confirming a resize. Change-Id: I29fac80d08baf64bf69e54cf673e55123174de2a Closes-Bug: #1728603 (cherry picked from commit f02afc6)
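Conceptually, the added check amounts to something like this sketch; helper and parameter names are illustrative, not the driver's actual code.

```python
import os


def _should_delete_instance_dir(instance_dir, root_disk_path, is_volume_backed):
    """Only treat a missing root disk as 'instance really moved' when the
    instance actually boots from a local image; a volume-backed instance
    never has a local root disk, so a shared instance directory must be
    left alone."""
    return (os.path.exists(instance_dir)
            and not is_volume_backed
            and not os.path.exists(root_disk_path))
```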
When we notice that an instance was deleted after scheduling, we punt on instance creation. When that happens, the scheduler will have created allocations already so we need to delete those to avoid leaking resources. Related-Bug: #1679750 Change-Id: I54806fe43257528fbec7d44c841ee4abb14c9dff (cherry picked from commit 57a3af6)
The resource tracker's _remove_deleted_instances_allocations() assumes that
InstanceNotFound means that an instance was deleted. That's not quite accurate,
as we would also see that in the window between creating allocations and actually
creating the instance in the cell database. As a result, the existing code
can kill allocations for instances before they are actually created.
This change makes us look up the instance with read_deleted=yes, and if we find
it with deleted=True, then we do the allocation removal. This does mean that
someone running a full DB archive at the instant an instance is deleted in some
way that didn't result in allocation removal as well could leak those. However,
we can log that (unlikely) situation.
Closes-Bug: #1729371
Conflicts:
nova/compute/resource_tracker.py
nova/tests/unit/compute/test_resource_tracker.py
NOTE(mriedem): Conflicts were due to not having change
1ff1310 or change
e3b7f43 in Pike.
Change-Id: I4482ac2ecf8e07c197fd24c520b7f11fd5a10945
(cherry picked from commit d176175)
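A simplified sketch of the read_deleted lookup; the real resource tracker derives its context and handles the result differently, so treat this as illustrative.

```python
from nova import context as nova_context
from nova import exception
from nova import objects


def _allocation_refers_to_deleted_instance(consumer_uuid):
    """Hedged sketch: only reap an allocation when the instance is
    positively known to be deleted, not merely absent from the cell DB."""
    ctxt = nova_context.get_admin_context(read_deleted='yes')
    try:
        instance = objects.Instance.get_by_uuid(ctxt, consumer_uuid,
                                                expected_attrs=[])
    except exception.InstanceNotFound:
        # Not created in the cell yet, or already archived; do not assume
        # deletion (the archived case can leak, which gets logged as an
        # unlikely situation).
        return False
    return bool(instance.deleted)
```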
…ize" into stable/pike
The hide_server_addresses extension is looking up the cached instance based on what the user provided for the server id, which may not match what is used to cache the instance for the request. For example, a request with upper-case server uuid could be found in a mysql-backed system because mysql is case insensitive by default, but the instance is keyed off the server id from the DB, which is lower-case, so we'll fail to look up the instance in the cache if the IDs don't match. There is no test for this because it turns out it's actually really hard to recreate this since it requires running with a mysql backend to recreate the case insensitive check, which isn't going to work with sqlite. Given how trivial this fix is, creating a big mysql recreate test is not worth it. Change-Id: I09b288aa2ad9969800a3cd26c675b002c6c9f638 Closes-Bug: #1693335 (cherry picked from commit ecfb65c)
…"" into stable/pike
When a server build fails on a selected compute host, the compute service will cast to conductor which calls the scheduler to select another host to attempt the build if retries are not exhausted. With commit 08d24b7, if retries are exhausted or the scheduler raises NoValidHost, conductor will deallocate networking for the instance. In the case of neutron, this means unbinding any ports that the user provided with the server create request and deleting any ports that nova-compute created during the allocate_for_instance() operation during server build. When an instance is deleted, its networking is deallocated in the same way - unbind pre-existing ports, delete ports that nova created. The problem is when rescheduling from a failed host, if we successfully reschedule and build on a secondary host, any ports created from the original host are not cleaned up until the instance is deleted. For Ironic or SR-IOV ports, those are always deallocated. The ComputeDriver.deallocate_networks_on_reschedule() method defaults to False just so that the Ironic driver could override it, but really we should always cleanup neutron ports before rescheduling. Looking over bug report history, there are some mentions of different networking backends handling reschedules with multiple ports differently, in that sometimes it works and sometimes it fails. Regardless of the networking backend, however, we are at worst taking up port quota for the tenant for ports that will not be bound to whatever host the instance ends up on. There could also be legacy reasons for this behavior with nova-network, so that is side-stepped here by just restricting this check to whether or not neutron is being used. When we eventually remove nova-network we can then also remove the deallocate_networks_on_reschedule() method and SR-IOV check. NOTE(mriedem): There are a couple of changes to the unit test for code that didn't exist in Pike, due to the change for alternate hosts Iae904afb6cb4fcea8bb27741d774ffbe986a5fb4 and the change to pass the request spec to conductor Ie5233bd481013413f12e55201588d37a9688ae78. Change-Id: Ib2abf73166598ff14fce4e935efe15eeea0d4f7d Closes-Bug: #1597596 (cherry picked from commit 3a503a8) (cherry picked from commit 9203326)
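In the compute manager's reschedule path, the change amounts to something like this hedged sketch; names follow nova's conventions of that era (e.g. utils.is_neutron()), and 'self', 'context', 'instance' and 'requested_networks' come from the enclosing method. It is not the verbatim backport.

```python
from nova import utils

# Always deallocate (unbind/delete) neutron ports before casting back to
# conductor for a reschedule, instead of only doing so for drivers that
# opt in via deallocate_networks_on_reschedule() (Ironic, SR-IOV).
deallocate = (self.driver.deallocate_networks_on_reschedule(instance)
              or utils.is_neutron())
if deallocate:
    self._cleanup_allocated_networks(context, instance, requested_networks)
```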
The _make_instance_list method is used to make an InstanceList object out of database dict-like instance objects. It's possible while making the list that the various _from_db_object methods that are called might do their own database writes. Currently, we're calling _make_instance_list nested inside of a 'reader' database transaction context and we hit the error: TypeError: Can't upgrade a READER transaction to a WRITER mid-transaction during the _make_instance_list call if anything tries to do a database write. The scenario encountered was after an upgrade to Pike, older service records without UUIDs were attempted to be updated with UUIDs upon access, and that access happened to be during an instance list, so it failed when trying to write the service UUID while nested inside the 'reader' database transaction context. This simply moves the _make_instance_list method call out from the @db.select_db_reader_mode decorated _get_by_filters_impl method to the get_by_filters method to remove the nesting. Closes-Bug: #1746509 Change-Id: Ifadf408802cc15eb9769d2dc1fc920426bb7fc20 (cherry picked from commit b1ed92c) (cherry picked from commit 22b2a8e)
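A rough sketch of the resulting structure in InstanceList; decorators and helper names approximate nova's and are not the verbatim patch.

```python
@classmethod
def get_by_filters(cls, context, filters, expected_attrs=None, **kwargs):
    # _get_by_filters_impl is the @db.select_db_reader_mode decorated
    # method: it opens a 'reader' transaction and only reads, returning
    # raw DB rows.
    db_inst_list = cls._get_by_filters_impl(context, filters, **kwargs)
    # Hydration can trigger writes (e.g. backfilling a service UUID), so
    # it must happen here, after the reader transaction has closed.
    return _make_instance_list(context, cls(), db_inst_list, expected_attrs)
```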
Currently, if the VM task_state is not None and the user tries to force-delete the instance, they get an HTTP 500 error and instance deletion does not progress. This is not the case when the delete API is used instead of the force-delete API, even if the VM task_state is not None. Fixed the issue by allowing force-delete to delete an instance whose task_state is not None. Change-Id: Ida1a9d8761cec9585f031ec25e5692b8bb55661e Closes-Bug: #1741000 (cherry picked from commit 0d2031a)
…t" into stable/pike
There are cases where a None value is set for cpuset_reserved in InstanceNUMATopology by the _numa_fit_instance_cell() function in hardware.py. However, the libvirt driver treats the cpuset_reserved value as an iterable when it constructs the XML configuration. To avoid the risk of an error in the libvirt driver, this patch adds a check that the value is not None before adding the CPUs for emulator threads. Change-Id: Iab3d950c4f4138118ac6a9fd98407eaadcb24d9e Closes-Bug: #1746674 (cherry picked from commit 24d9e06) (cherry picked from commit 2dc4d7a)
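The guard itself is small; a hedged sketch with illustrative variable names, inside the libvirt driver's NUMA config code:

```python
# cpuset_reserved may legitimately be None, so only extend the emulator
# thread pin set when it is actually populated.
if cell.cpuset_reserved:
    emulator_pin_cpus.extend(cell.cpuset_reserved)
```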
Change I11746d1ea996a0f18b7c54b4c9c21df58cc4714b changed the
behavior of the API and conductor when rebuilding an instance
with a new image such that the image is run through the scheduler
filters again to see if it will work on the existing host that
the instance is running on.
As a result, conductor started passing 'scheduled_node' to the
compute which was using it for logic to tell if a claim should be
attempted. We don't need to do a claim for a rebuild since we're
on the same host.
This removes the scheduled_node logic from the claim code, as we
should only ever attempt a claim if we're evacuating, which we
can determine based on the 'recreate' parameter.
Conflicts:
nova/compute/manager.py
NOTE(mriedem): The conflict is due to change
I0883c2ba1989c5d5a46e23bcbcda53598707bcbc in Queens.
Change-Id: I7fde8ce9dea16679e76b0cb2db1427aeeec0c222
Closes-Bug: #1750618
(cherry picked from commit a390290)
(cherry picked from commit 3c5e519)
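The resulting decision in the compute manager can be sketched as follows; 'rt' is the resource tracker obtained in the enclosing method, and this is a hedged sketch rather than the verbatim change.

```python
from nova.compute import claims

# Claim resources only when evacuating (recreate=True); a same-host
# rebuild, even one routed through the scheduler for image checks,
# needs no claim.
if recreate:
    rebuild_claim = rt.rebuild_claim   # evacuating: claim on the new host
else:
    rebuild_claim = claims.NopClaim    # same-host rebuild: no claim needed
```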
…led" into stable/pike
…n" into stable/pike
…guration complexity
viveknandavanam requested changes on Apr 12, 2018
under the [DEFAULT] section
driver = nova.scheduler.turbonomic_scheduler.TurbonomicScheduler
scheduler_driver = nova.scheduler.turbonomic_scheduler.TurbonomicScheduler
This wasn't changed in the Mitaka branch, so why is it showing up as a change?
This exists only on the Pike branch. Did you create the patch on the right branch?
2) Add turbonomic_driver to <Python 2.7>/site-packages/nova-16.1.0-py2.7.egg-info/entry_points.txt:
turbonomic_scheduler = nova.scheduler.turbonomic_scheduler:TurbonomicScheduler
2) scheduler_driver should be enabled across all regions, turbonomic_target_address must be equal to the address specified
by the customer while discovering the target, e.x. a target consists of RegionOne (X.X.X.10) and RegionTwo (X.X.X.11)
while discovering the target -> while discovering the target in Turbonomic
Can you also work on resolving the conflicts mentioned above?
Added turbonomic_target_address parameter to mitigate OpenStack configuration complexity