Holy crap, you really don't want a non-device-local SSBO.
Doing some tests around transform data with Vulkan. HOST_CACHED SSBO for matrix data: 42.23 ms/frame. DEVICE_LOCAL: 8.38 ms/frame.
Neither of these is as good as (multiple) dynamic UBOs, but still, holy crap, that's roughly a 5x difference.
(Should note that the test scene is the Amazon Bistro scene with no meshes combined.)
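
For anyone wondering where that choice gets made: it comes down to which memory type index you allocate the buffer from. Here's a minimal sketch of the usual "scan the memory types for the flags you want" pattern. This is not my actual test code. The structs and constants are hypothetical stand-ins that mirror `VkPhysicalDeviceMemoryProperties` and the `VK_MEMORY_PROPERTY_*` bit values so it compiles without `vulkan.h`, and `findMemoryType` / `kNoMemoryType` are names made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the Vulkan flag bits (values match VkMemoryPropertyFlagBits).
constexpr uint32_t kDeviceLocal  = 0x1; // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr uint32_t kHostVisible  = 0x2; // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr uint32_t kHostCoherent = 0x4; // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
constexpr uint32_t kHostCached   = 0x8; // VK_MEMORY_PROPERTY_HOST_CACHED_BIT

// Stand-ins for VkMemoryType / VkPhysicalDeviceMemoryProperties
// (heap info omitted; 32 is VK_MAX_MEMORY_TYPES).
struct MemoryType {
    uint32_t propertyFlags;
    uint32_t heapIndex;
};
struct MemoryProperties {
    uint32_t memoryTypeCount;
    MemoryType memoryTypes[32];
};

// Hypothetical sentinel meaning "no suitable memory type".
constexpr uint32_t kNoMemoryType = UINT32_MAX;

// Returns the first memory type index that is allowed by typeBits
// (as reported in VkMemoryRequirements::memoryTypeBits) and has every
// flag in `required` set, or kNoMemoryType if none matches.
uint32_t findMemoryType(const MemoryProperties& props,
                        uint32_t typeBits,
                        uint32_t required) {
    assert(props.memoryTypeCount <= 32);
    for (uint32_t i = 0; i < props.memoryTypeCount; ++i) {
        const bool allowed  = (typeBits & (1u << i)) != 0;
        const bool hasFlags = (props.memoryTypes[i].propertyFlags & required) == required;
        if (allowed && hasFlags) {
            return i;
        }
    }
    return kNoMemoryType;
}
```

In a real app you'd fill the properties from `vkGetPhysicalDeviceMemoryProperties` and pass the result as `VkMemoryAllocateInfo::memoryTypeIndex`. Asking for `kDeviceLocal` instead of `kHostVisible | kHostCached` is the whole difference between the two numbers above. Keep in mind that device-local memory often isn't host-visible, so the matrix data typically has to go through a staging buffer copy.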