Bug 1516583 - httpd crashed when many restapi request
Summary: httpd crashed when many restapi request
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Frontend.WebAdmin
Version: future
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
: ---
Assignee: bugs@ovirt.org
QA Contact: Pavel Stehlik
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-11-23 00:56 UTC by seesky
Modified: 2017-12-11 01:44 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-12-11 01:44:36 UTC
oVirt Team: Infra


Attachments (Terms of Use)
httpd error_log (deleted)
2017-11-23 00:56 UTC, seesky
no flags Details
server.log (deleted)
2017-11-23 01:12 UTC, seesky
no flags Details
engine.log-20171124.7z.001 (deleted)
2017-11-24 00:26 UTC, seesky
no flags Details
engine.log-20171124.7z.002 (deleted)
2017-11-24 00:27 UTC, seesky
no flags Details
engine.log-20171124.7z.003 (deleted)
2017-11-24 00:29 UTC, seesky
no flags Details
engine.log-20171124.7z.004 (deleted)
2017-11-24 00:30 UTC, seesky
no flags Details
engine.log-20171124.7z.005 (deleted)
2017-11-24 00:31 UTC, seesky
no flags Details
engine.log-20171124.7z.006 (deleted)
2017-11-24 00:33 UTC, seesky
no flags Details

Description seesky 2017-11-23 00:56:17 UTC
Created attachment 1357946 [details]
httpd error_log

Description of problem:
    Sorry for my English...
    I have an oVirt environment with 2 servers (2x E5-2678 v3, 256 GB DDR4, Intel PCIe SSD 3520), each serving 100 Windows 7 desktop users (Windows 7 x86, 3 GB RAM).
    ovirt-engine is installed on host1 together with vdsm.
    httpd crashes under many requests, about 500 REST requests every 5 seconds (the thin client software refreshes the VM status every 5 seconds). I then have to restart the httpd service.
    I can't find any httpd log entry about this error. I don't think it is a server performance problem (30% CPU, 70% RAM used).

    I think it is an error in the ovirt-engine module inside httpd, not in httpd itself...

Version-Release number of selected component (if applicable):


How reproducible:

Reproducible with a large number of requests to the REST API

Steps to Reproduce:
1. Install ovirt-engine and vdsm on the same server
2. Create 100 desktop users
3. Repeatedly execute login / get VM / refresh VM status
4. httpd crashes and has to be restarted

Actual results:
httpd crashed, the VMs keep running normally, and the server still has plenty of spare resources

Expected results:


Additional info:

Comment 1 seesky 2017-11-23 01:12:52 UTC
Created attachment 1357949 [details]
server.log

Comment 2 Yaniv Kaul 2017-11-23 07:54:55 UTC
It is quite unconventional to set up Engine and VDSM on the same server.
On that same server you are also running many VMs?

What version of oVirt are you running? What version of the API are you using? What version of CentOS? Can you share the REST API calls? 

Are you using persistent auth.?

Comment 3 seesky 2017-11-23 14:37:22 UTC
(In reply to Yaniv Kaul from comment #2)
> It is quite unconventional to set up Engine and VDSM on the same server.
> On that same server you are also running many VMs?
> 
> What version of oVirt are you running? What version of the API are you
> using? What version of CentOS? Can you share the REST API calls? 
> 
> Are you using persistent auth.?

(1) There are 50 Windows 7 VMs on the engine server...
(2) oVirt engine version: 4.1.5.2-1.el7.centos
(3) CentOS 7
(4) I wrote a thin client application with Qt and C++:
1. The application gets a token string from the SSO REST service.
2. It gets the VM list from the REST API (usually only one VM).
3. It starts a new thread to refresh the VM status every 5 seconds (for example: running, stopped, restarting...). The status refresh request keeps repeating even if the token becomes invalid, until the application quits. (I know it's a bug. It hasn't been fixed yet.)
4. Conclusion: 100 users * 1 VM * 1 status refresh request = 100 REST API requests in 1 second. I don't think Apache should crash because of these requests. I used top to check CPU and memory: only 35% CPU and 70% memory are used, so resources are available.
5. I don't need to restart the ovirt-engine service; just restarting httpd recovers the REST service and the admin web UI. I'm not clear about the relationship between httpd and WildFly. Perhaps it's just because I run too many VMs on this server...

Comment 4 Yaniv Kaul 2017-11-23 15:11:16 UTC
Can you also share engine.log?

Comment 5 seesky 2017-11-24 00:26:42 UTC
Created attachment 1358455 [details]
engine.log-20171124.7z.001

Comment 6 seesky 2017-11-24 00:27:54 UTC
Created attachment 1358456 [details]
engine.log-20171124.7z.002

Comment 7 seesky 2017-11-24 00:29:19 UTC
Created attachment 1358457 [details]
engine.log-20171124.7z.003

Comment 8 seesky 2017-11-24 00:30:55 UTC
Created attachment 1358458 [details]
engine.log-20171124.7z.004

Comment 9 seesky 2017-11-24 00:31:47 UTC
Created attachment 1358459 [details]
engine.log-20171124.7z.005

Comment 10 seesky 2017-11-24 00:33:51 UTC
Created attachment 1358460 [details]
engine.log-20171124.7z.006

Comment 11 seesky 2017-11-24 00:51:31 UTC
(In reply to Yaniv Kaul from comment #4)
> Can you also share engine.log?

  All the errors are:
2017-11-23 20:59:58,935+08 ERROR [org.ovirt.engine.core.sso.utils.SsoUtils] (default task-159) [] OAuthException invalid_grant: The provided authorization grant for the auth code has expired.
2017-11-23 20:59:58,937+08 ERROR [org.ovirt.engine.core.aaa.filters.SsoRestApiAuthFilter] (default task-444) [] Cannot authenticate using authentication Headers: invalid_grant: The provided authorization grant for the auth code has expired.


Thin client application request code:
1. Login button click
void MainWindow::on_loginPushButton_clicked()
{

    HttpRequestInput *input = new HttpRequestInput("https://" + Utils::getServerIPAddress() + "/ovirt-engine/sso/oauth/token", "POST");
    input->add_var("grant_type", "password");
    input->add_var("scope", "ovirt-app-api");
    QString username = ui->usernamelineEdit->text().trimmed();
    QString password = ui->passwordlineEdit->text().trimmed();
    //send username and password to sso service
    input->add_var("username", username + "@internal");
    input->add_var("password", password);
   
    input->headers.insert("Accept", "application/json");

    HttpRequestWorker *worker = new HttpRequestWorker();
    
    connect(worker, &HttpRequestWorker::on_execution_finished, this, &MainWindow::http_request_result);
    worker->execute(input);
    pIndicator->startAnimation();
}

2. Login processing: get the session token and save it. After that, the application creates a new window to show the VM and its status.
void MainWindow::http_request_result(HttpRequestWorker *worker)
{
    char *response = worker->response.data();

    QJsonParseError json_error;
    QJsonDocument parse_doucment = QJsonDocument::fromJson(worker->response, &json_error);

    //qDebug(response);

    if(json_error.error == QJsonParseError::NoError)
    {

        if(parse_doucment.isObject())
        {
            QJsonObject obj = parse_doucment.object();
            if(obj.contains("error"))
            {
                // If an error code is returned, show it and stop the login process
                QJsonValue error = obj.take("error");
                if(error.isString())
                {
                    QString _error = error.toString();
                    QJsonValue errorCode = obj.take("error_code");
                    QString _errorCode = errorCode.toString();
                    //restAPIResultLabel->setText("  Error code:" + _errorCode + "  Error message:" + _error);
                    QMessageBox::about(NULL, "Login failed, please check that your username and password are correct!", "  Error code:" + _errorCode + "  Error message:" + _error);
                }
            }else if(obj.contains("access_token"))
            {
                QJsonValue access_token = obj.take("access_token");
                if(access_token.isString())
                {
                    QString _access_token = access_token.toString();
                    Utils::setCurrentToken(_access_token);

                    if(Utils::getEnableUsername() == "1"){

                    }else{
                        ui->usernamelineEdit->setText("");
                        ui->passwordlineEdit->setText("");
                        restAPIResultLabel->setText("");
                    }


                    // After 5 minutes, automatically leave the main screen and return to the login screen
                    // VM status refresh thread
                    // There is an application bug in my production environment: the VM status refresh goes on forever, even after the token becomes invalid. I can't fix it immediately because 100 users are using it now.
                    seesionStatusUpdate = new SessionStatusUpdateThread();
                    connect(seesionStatusUpdate, SIGNAL(sessionStatusUpdate(bool)), this, SLOT(sessionStatusUpdateSolts(bool)),Qt::DirectConnection);
                    seesionStatusUpdate->start();


                    // Login succeeded: open the VM list window (please refer to vmlist.png)
                    vmListWindow = new VMListWindow();

                    // Pointer-to-member arguments need the address-of operator
                    connect(vmListWindow, &VMListWindow::sessionExit, this, &MainWindow::sessionExit);

                    vmListWindow->show();
                    this->hide();


                }
            }
        }
    }

    pIndicator->stopAnimation();
}

Comment 12 seesky 2017-11-24 01:05:14 UTC
(In reply to Yaniv Kaul from comment #4)
> Can you also share engine.log?

I think it should be logged as a warning when the token becomes invalid, not as an error.

Comment 13 Martin Perina 2017-11-24 13:51:21 UTC
Unfortunately engine.log, server.log and error_log contain records from different time intervals, so could you please provide us logs from the same time interval (I'm not able to correlate the Apache restart with records from the engine and server logs at the same time)? It would also be helpful if we could see logs from before the issue occurred.

Are there any other applications running on your Apache? Isn't your network interface flooded by accessing 100 VMs through a single network interface, while you are also communicating with the engine on the same interface?

Comment 14 Juan Hernández 2017-11-24 14:20:08 UTC
It isn't clear from the code that you shared, but could it be that your clients are opening the HTTP connection and leaving it open? It is easy and common to forget to close it. If so, you may be exhausting the connections that the web server is able to process. By default Apache accepts up to 256 simultaneous connections. This can be changed in the configuration, but I don't think you really need more than 256 simultaneous connections.

I'd suggest first checking the actual number of connections that you have to your web server:

  # ss --tcp --numeric --processes | grep :443

Then I'd suggest making sure that your application does this:

1. Send the initial request to get the SSO token, and save it in memory.

2. Send the requests to find the identifier of the virtual machine, and save it in memory as well. For these two steps you can use the same connection, but then you should close it. Keeping it open 5 seconds till you need it again is a waste of resources: you are blocking one of the Apache web server processes for nothing.

3. Whenever you need to check the status of the virtual machine create a new connection, send the request for that specific virtual machine only, using the token and virtual machine identifiers that you have in memory. Then make sure to close the connection.

This should ensure that you only keep the connections open when they are really needed.

Also, I'd suggest you consider increasing that 5 second period; it seems too short to me. Why do you need to check so frequently?

