SOGo | BTS

ID: 0003446
Project: SOGo
Category: SOPE
View Status: public
Date Submitted: 2016-01-19 12:07
Last Update: 2018-06-12 11:55
Reporter: lpouzenc
Assigned To: ludovic
Priority: normal
Severity: minor
Reproducibility: always
Status: closed
Resolution: fixed
Platform: [Server] Linux
OS: Debian
OS Version: 8 (Jessie)
Product Version: 2.2.9
Target Version:
Fixed in Version: 4.0.1
Summary: 0003446: Stuck sogod processes, high CPU, no syscalls: infinite loop while reading SSL socket
Description: Hi,

We installed sogo from Debian Jessie with apt-get, configured it with LDAP + CAS, and put it into production. We now see sogod processes from the pool stuck at high CPU usage. This happens many times per hour with 3500 potential users (but few parallel requests are visible in the logs).

On those processes, strace -p <pid> shows no syscalls.

gdbserver / ddd shows an infinite loop in sope-2.2.9/sope-core/NGStreams/NGByteBuffer.m at line 247, in a curiously named function la() with some "//TODO" comments in it, copyrighted from 2000 to 2005.

The infinite loop, with some parts elided:

readStream = YES;
while (readStream) {
  desiredBytes = 738;
  cntReadBytes = self->readBytes(..., desiredBytes); // readBytes always returns 0
  if (cntReadBytes == NGStreamError) {
    break; // Never reached
  } else {
    if (cntReadBytes == desiredBytes) {
      readStream = NO; // Never reached
    } else {
      while (cntReadBytes > 0) {
        // [...] // Never reached
      }
    }
  }
}
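
A minimal C sketch of the same read loop, not the SOPE code itself. The names fill_buffer, read_fn, STREAM_ERROR, eof_reader and chunk_reader are hypothetical and introduced only for illustration. It assumes the reader returns 0 on EOF, as the gnutls_record_recv man page describes, and shows that treating a zero-byte read as a stop condition lets the loop terminate:

```c
#include <stddef.h>
#include <string.h>

#define STREAM_ERROR (-1)   /* hypothetical stand-in for NGStreamError */

/* Hypothetical reader callback: returns bytes read, 0 on EOF,
 * STREAM_ERROR on failure. */
typedef int (*read_fn)(void *ctx, char *buf, size_t len);

/* Reads until `want` bytes are buffered, the peer closes, or an error
 * occurs. Returns the number of bytes stored, or STREAM_ERROR. */
static int fill_buffer(read_fn reader, void *ctx, char *buf, size_t want) {
    size_t have = 0;
    while (have < want) {
        int n = reader(ctx, buf + have, want - have);
        if (n == STREAM_ERROR)
            return STREAM_ERROR;
        if (n == 0)   /* EOF: without this check the loop never exits */
            break;
        have += (size_t)n;
    }
    return (int)have;
}

/* Stub that behaves like a closed connection: always reports EOF. */
static int eof_reader(void *ctx, char *buf, size_t len) {
    (void)ctx; (void)buf; (void)len;
    return 0;
}

/* Stub that delivers a string in chunks of at most 3 bytes, then EOF. */
struct chunk_src { const char *data; size_t pos; };

static int chunk_reader(void *ctx, char *buf, size_t len) {
    struct chunk_src *s = ctx;
    size_t left = strlen(s->data) - s->pos;
    size_t n = left < 3 ? left : 3;
    if (n > len)
        n = len;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return (int)n;
}
```

With eof_reader, fill_buffer returns 0 immediately instead of spinning. With chunk_reader, it accumulates short reads and stops at EOF if the source runs out early. Whether a premature EOF should then be reported to the caller as an error is a separate design decision that this sketch leaves open.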

The readBytes method calls GnuTLS almost directly:

- (unsigned)readBytes:(void *)_buf count:(unsigned)_len {
  // [...]
  ret = gnutls_record_recv((gnutls_session_t) self->session, _buf, _len);
  if (ret < 0)
    return NGStreamError;
  else
    return ret;
}
Steps To Reproduce: I'm unsure what particular context causes gnutls_record_recv() to constantly return 0, but whenever it does, the infinite loop in NGByteBuffer.m is always reproducible.

It may be a lost connection to a user on WiFi, or something similar?



Additional Information: GnuTLS may have slightly changed its behavior across versions regarding null reads / EOF?

The gnutls_record_recv man page from Jessie's gnutls-doc (3.3.8-6+deb8u3), at least in some versions, says:

RETURNS
       The number of bytes received and zero on EOF (for stream connections). A negative error code is returned in case of an error. The number of bytes received might be less than the requested data_size.
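
The return convention quoted above can be sketched in C as a three-way classification. This is only an illustration under stated assumptions: recv_outcome, classify_recv and needs_more are hypothetical names, and the return value is taken as a long here rather than the ssize_t that gnutls_record_recv actually returns.

```c
#include <stddef.h>

/* Hypothetical outcome categories for one receive call. */
enum recv_outcome { RECV_DATA, RECV_EOF, RECV_ERROR };

/* Classifies a return value that follows the convention quoted above:
 * positive = bytes received, zero = EOF, negative = error code. */
static enum recv_outcome classify_recv(long ret) {
    if (ret > 0)
        return RECV_DATA;
    if (ret == 0)
        return RECV_EOF;
    return RECV_ERROR;
}

/* A short read is still data: the caller has to ask again for the rest.
 * EOF and errors never mean "try again for more". */
static int needs_more(long ret, size_t requested) {
    return classify_recv(ret) == RECV_DATA && (size_t)ret < requested;
}
```

The readBytes wrapper shown earlier folds this into two cases (negative becomes NGStreamError, everything else is returned as a byte count), so a caller that only checks for NGStreamError and for a full read cannot tell EOF apart from "no data yet".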
Tags: No tags attached.
Attached Files: fix-sogo-infinite-loops.patch (573 bytes) 2016-01-21 05:45


Notes
(0009308)
lpouzenc (reporter)
2016-01-21 05:53

Please find attached a patch that eradicates the stuck processes on my production setup. It treats the "read after EOF" case as NGStreamError.

No side effects found in the last 2 days, but consider it highly experimental.

Hoping for the best,
Ludovic
(0009944)
ludovic (administrator)
2016-04-08 12:50

I don't think it's the right fix.

gnutls_record_recv() returns 0 because EOF is reached, not because there's an error (< 0). So that code leads to unwanted code paths.
(0012915)
ludovic (administrator)
2018-06-12 11:55

I fixed a similar issue in SOPE for OpenSSL (https://github.com/inverse-inc/sope/commit/2f26952009f622f97a43921a6cfdafb79b8f46f6), where SSL_read clearly has a special error meaning when it returns 0.

Issue History
Date Modified     Username  Field                                      Change
2016-01-19 12:07  lpouzenc  New Issue
2016-01-21 05:45  lpouzenc  File Added: fix-sogo-infinite-loops.patch
2016-01-21 05:53  lpouzenc  Note Added: 0009308
2016-04-08 12:50  ludovic   Note Added: 0009944
2016-04-08 12:50  ludovic   Severity                                   major => minor
2018-06-12 11:55  ludovic   Note Added: 0012915
2018-06-12 11:55  ludovic   Status                                     new => closed
2018-06-12 11:55  ludovic   Assigned To                                => ludovic
2018-06-12 11:55  ludovic   Resolution                                 open => fixed
2018-06-12 11:55  ludovic   Fixed in Version                           => 4.0.1

