Segmentation fault with c library - but working fine with smaller

streetster
New Contributor
Hi Guys,

I have a simple function that uses zlib to compress a string. It works fine for strings of length, say, 1 million chars, but fails with strings that are larger than around 8.35 million chars.

"Why am I doing this?" I hear you ask... I want to take the output from .q.csv and compress it before passing it to a (web) client via something like .h.hn. This works great for small tables (around 8 MB worth of CSV), but fails for larger ones.

compress.c

#include <string.h> // for memcpy
#include <zlib.h>   // for compressBound, compress

#include "k.h"

// takes a string, compresses and returns byte array
K1(k_compress)
{
  if (xt != KC)
    return krr((S)"type");

  uLong l = xn;                         // length of input
  uLong cl = compressBound(l);          // compressed length
  unsigned char tmp[cl];                // temporary array

  if(0 == compress(tmp, &cl, kC(x), l)) // compress successful?
  {
    K y = ktn(KG, cl);                  // byte-array of real compressed length
    memcpy(kG(y), &tmp, cl);            // copy compressed data into K object
    R y;                                // return byte array
  }

  R krr((S)"compress");                 // return error to client
}

Makefile

CCOPTS = -fno-builtin -Wall -g -fno-omit-frame-pointer -lz -std=c99 -shared -fPIC -DKXVER=3 -O3 -I../include

all:
        mkdir -p l32
        mkdir -p l64
        gcc $(CCOPTS) zlib.c -o l32/zlib.so -m32
        gcc $(CCOPTS) zlib.c -o l64/zlib.so


compress.q

/ note I have a CHOME environment variable defined
.zlib.compress:(`$getenv[`CHOME],"/lib/zlib/",string[.z.o],"/zlib")2:(`k_compress;1); 
a:.zlib.compress 10000#"hello";
-1 string count a;
b:.zlib.compress 100000#"hello";
-1 string count b;
c:.zlib.compress 1000000#"hello";
-1 string count c;
d:.zlib.compress 10000000#"hello"; / crashes out
-1 string count d; 


Output

44
175
1483
rlwrap: warning: taskset crashed, killed by SIGSEGV.
rlwrap itself has not crashed, but for transparency,
it will now kill itself (without dumping core) with the same signal


I'm using q/kdb+ 3.3; the segfault occurs with both 32- and 64-bit binaries.

Note: the intermediate tmp variable is there because otherwise we get a K byte array with a bunch of nulls at the end (cl is updated with the true compressed length as a side effect of compress()).


Any help or advice would be greatly appreciated; so far I'm completely stumped!

/Mark
5 REPLIES

charlie
New Contributor II
cl is too big for the stack

I thought I'd tried using malloc with the same result... I'd also managed to put a bug in the memcpy with &tmp instead of tmp. Below is the working code should anyone want to do something similar:

#include <string.h> // for memcpy
#include <stdlib.h> // for malloc
#include <zlib.h>   // for compressBound, compress

#include "k.h"

// takes a string, compresses and returns byte array
K1(k_compress)
{
  if (xt != KC)
    return krr((S)"type");

  uLong l = xn;                         // length of input
  uLong cl = compressBound(l);          // compressed length
  unsigned char *tmp = malloc(cl);      // temporary array

  if(0 == compress(tmp, &cl, kC(x), l)) // compress successful?
  {
    K y = ktn(KG, cl);                  // byte-array of real compressed length
    memcpy(kG(y), tmp, cl);             // copy from temp into k object
    R y;                                // return byte array
  }
  R krr((S)"compress");                 // return error to client
}


Cheers for the pointer!

/Mark

charlie
New Contributor II
don't forget to free tmp on return

Good point - thanks 🙂

felix1
New Contributor
Instrument the code (-g) and force a core dump on crash (ulimit -c unlimited).
Once you have the core, use gdb or lldb to get a stack trace and pinpoint the error.

On 3 Aug 2017 18:58, "Mark Street" <streetster@gmail.com> wrote: