I am looking for some advice regarding the storage architecture for KDB.
We currently have a physical KDB server attached to a storage array. The array presents 4 disks to Linux, which are combined into a volume group. KDB accesses the volume group and stores its data spread across it.
The storage array is reaching end of life, and I am now looking at the best way to present the storage from a NetApp array. We have also built a new virtualized VMware server.
Presenting the storage through LUNs and Fibre Channel over Ethernet is not an option in this scenario.
Ideally, I'd like to present multiple NFS volumes from the NetApp array and mount the volumes in Linux. The advantage of multiple volumes in NetApp is that backing up a volume can be done directly from the NetApp array, rather than going through the ESX layer and then to the virtualized Linux host.
I've been reading about using a par.txt file to combine the Linux NFS volumes so that KDB can simply reference a single database. It appears that data is distributed round robin to each volume.
I've got a few questions about the implementation around the par.txt file and whether this is the right approach.
I am approaching this more from a storage perspective and have limited KDB knowledge.
Any advice around options would be appreciated.
Some built-in functions do make assumptions about how data is stored in segmented databases.
These functions assume each date is stored in the par.txt segment whose index equals the date modulo the number of par.txt entries, i.e. round robin.
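To illustrate the round-robin rule from a storage point of view, here is a minimal Python sketch. It assumes the date is taken as its kdb+ integer value (days since 2000.01.01) and that par.txt lists one segment directory per line. The mount paths and the helper names parse_par_txt and segment_for_date are hypothetical, and this is not the actual kdb+ implementation.

```python
from datetime import date

# kdb+ represents dates as days since 2000.01.01
KDB_EPOCH = date(2000, 1, 1)

# Hypothetical par.txt contents: one segment directory per line
PAR_TXT = """/mnt/nfs0/db
/mnt/nfs1/db
/mnt/nfs2/db
/mnt/nfs3/db
"""

def parse_par_txt(text):
    """Return the list of segment directories, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]

def segment_for_date(d, segments):
    """Pick the segment by (days since 2000.01.01) mod (number of segments)."""
    days = (d - KDB_EPOCH).days
    return segments[days % len(segments)]

segments = parse_par_txt(PAR_TXT)
for d in (date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)):
    print(d.isoformat(), "->", segment_for_date(d, segments))
```

Under this rule consecutive dates land on consecutive volumes, so each NFS volume would hold roughly every fourth day of data in a four-entry par.txt.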
See related thread:
"Partition data correctly: data for a particular date must reside in the partition for that date."
However, for querying and normal operations where these functions are not called, there is no such requirement.
Symlinks are often used in kdb+ systems to allow flexibility in storage layouts.
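As a rough sketch of the symlink approach, the Python below places a date partition directory on one volume and links it into the database root. Temporary directories stand in for the NFS mounts. The names hdb and nfs_vol1, the partition 2024.01.01, and the helper link_partition are all hypothetical, and a real layout would depend on how your database is organised.

```python
import os
import tempfile

def link_partition(db_root, partition, target_dir):
    """Create target_dir if needed and symlink it as db_root/partition."""
    os.makedirs(target_dir, exist_ok=True)
    link_path = os.path.join(db_root, partition)
    os.symlink(target_dir, link_path)
    return link_path

# Temporary directories stand in for the database root and an NFS mount
base = tempfile.mkdtemp()
db_root = os.path.join(base, "hdb")
os.makedirs(db_root)

target = os.path.join(base, "nfs_vol1", "2024.01.01")
link = link_partition(db_root, "2024.01.01", target)
print(link, "->", os.readlink(link))
```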